[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563521153 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/PartitioningUtils.scala ## @@ -30,14 +35,33 @@ object PartitioningUtils { */

[GitHub] [spark] JkSelf commented on a change in pull request #31258: [SPARK-34168] [SQL] Support DPP in AQE when the join is Broadcast hash join at the beginning

2021-01-24 Thread GitBox
JkSelf commented on a change in pull request #31258: URL: https://github.com/apache/spark/pull/31258#discussion_r563520715 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ## @@ -133,7 +133,7 @@ case class AdaptiveSparkP

[GitHub] [spark] akiyamaneko commented on pull request #31314: [SPARK-34221][WEBUI] Ensure if a stage fails in the UI page, the corresponding error message can be displayed correctly.

2021-01-24 Thread GitBox
akiyamaneko commented on pull request #31314: URL: https://github.com/apache/spark/pull/31314#issuecomment-766622549 @sarutak thanks for your review. I forgot that `at` can appear in the Java exception stack. My code doesn't handle it very well, so what do you think is appropriate t

[GitHub] [spark] yaooqinn commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
yaooqinn commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563513856 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/HiveCharVarcharTestSuite.scala ## @@ -50,6 +51,35 @@ class HiveCharVarcharTestSuite extends C

[GitHub] [spark] SparkQA commented on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
SparkQA commented on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766615508 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39009/

[GitHub] [spark] cloud-fan closed pull request #31265: [SPARK-34197][SQL] `SessionCatalog.refreshTable()` should not invalidate the relation cache for temporary views

2021-01-24 Thread GitBox
cloud-fan closed pull request #31265: URL: https://github.com/apache/spark/pull/31265 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [spark] cloud-fan commented on pull request #31265: [SPARK-34197][SQL] `SessionCatalog.refreshTable()` should not invalidate the relation cache for temporary views

2021-01-24 Thread GitBox
cloud-fan commented on pull request #31265: URL: https://github.com/apache/spark/pull/31265#issuecomment-766613665 thanks, merging to master!

[GitHub] [spark] MaxGekk commented on a change in pull request #31265: [SPARK-34197][SQL] `SessionCatalog.refreshTable()` should not invalidate the relation cache for temporary views

2021-01-24 Thread GitBox
MaxGekk commented on a change in pull request #31265: URL: https://github.com/apache/spark/pull/31265#discussion_r563509932 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala ## @@ -1680,4 +1680,20 @@ abstract class Sessio

[GitHub] [spark] cloud-fan commented on a change in pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31273: URL: https://github.com/apache/spark/pull/31273#discussion_r563510829 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ## @@ -558,20 +560,27 @@ object ViewHelper { catalog: S

[GitHub] [spark] cloud-fan commented on a change in pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31273: URL: https://github.com/apache/spark/pull/31273#discussion_r563510450 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ## @@ -558,20 +560,27 @@ object ViewHelper { catalog: S

[GitHub] [spark] SparkQA commented on pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
SparkQA commented on pull request #31281: URL: https://github.com/apache/spark/pull/31281#issuecomment-766611889 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39007/

[GitHub] [spark] sarutak commented on pull request #31314: [SPARK-34221][WEBUI] Ensure if a stage fails in the UI page, the corresponding error message can be displayed correctly.

2021-01-24 Thread GitBox
sarutak commented on pull request #31314: URL: https://github.com/apache/spark/pull/31314#issuecomment-766610305 @akiyamaneko Thanks for the contribution. I tried applying your change and noticed that if the `TaskEndReason` is `ExceptionFailure`, your change breaks the appearance of the heade

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform optimization

2021-01-24 Thread GitBox
AmplabJenkins removed a comment on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766605420 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39006/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30788: [SPARK-33726][SQL] Fix for Duplicate field names during Aggregation

2021-01-24 Thread GitBox
AmplabJenkins removed a comment on pull request #30788: URL: https://github.com/apache/spark/pull/30788#issuecomment-766605416 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134418/

[GitHub] [spark] Ngone51 edited a comment on pull request #31298: [SPARK-34193][CORE] TorrentBroadcast block manager decommissioning race fix

2021-01-24 Thread GitBox
Ngone51 edited a comment on pull request #31298: URL: https://github.com/apache/spark/pull/31298#issuecomment-766596881 ``` scala [info] org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 3.0 failed 4 times, most recent failure: Lost task 1.3 in stage

[GitHub] [spark] SparkQA commented on pull request #31296: [SPARK-34205][SQL][SS] Add pipe to Dataset to enable Streaming Dataset pipe

2021-01-24 Thread GitBox
SparkQA commented on pull request #31296: URL: https://github.com/apache/spark/pull/31296#issuecomment-766606549 **[Test build #134424 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134424/testReport)** for PR 31296 at commit [`45bd4a7`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #31314: [SPARK-34221][WEBUI] Ensure if a stage fails in the UI page, the corresponding error message can be displayed correctly.

2021-01-24 Thread GitBox
SparkQA commented on pull request #31314: URL: https://github.com/apache/spark/pull/31314#issuecomment-766606468 **[Test build #134423 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134423/testReport)** for PR 31314 at commit [`529ebb0`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform optimization

2021-01-24 Thread GitBox
AmplabJenkins commented on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766605420 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39006/

[GitHub] [spark] AmplabJenkins commented on pull request #30788: [SPARK-33726][SQL] Fix for Duplicate field names during Aggregation

2021-01-24 Thread GitBox
AmplabJenkins commented on pull request #30788: URL: https://github.com/apache/spark/pull/30788#issuecomment-766605416 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134418/

[GitHub] [spark] yaooqinn commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
yaooqinn commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563503267 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -988,7 +989,15 @@ private[hive] class HiveClientImpl(

[GitHub] [spark] SparkQA commented on pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
SparkQA commented on pull request #31281: URL: https://github.com/apache/spark/pull/31281#issuecomment-766602770 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39007/

[GitHub] [spark] SparkQA commented on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
SparkQA commented on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766601428 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39009/

[GitHub] [spark] Ngone51 commented on pull request #31298: [SPARK-34193][CORE] TorrentBroadcast block manager decommissioning race fix

2021-01-24 Thread GitBox
Ngone51 commented on pull request #31298: URL: https://github.com/apache/spark/pull/31298#issuecomment-766596881 ``` scala [info] org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 3.0 failed 4 times, most recent failure: Lost task 1.3 in stage 3.0 (T

[GitHub] [spark] cloud-fan commented on a change in pull request #31265: [SPARK-34197][SQL] `SessionCatalog.refreshTable()` should not invalidate the relation cache for temporary views

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31265: URL: https://github.com/apache/spark/pull/31265#discussion_r563498712 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala ## @@ -1680,4 +1680,20 @@ abstract class Sess

[GitHub] [spark] cloud-fan commented on a change in pull request #31308: [SPARK-34215][SQL] Keep tables cached after truncation

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31308: URL: https://github.com/apache/spark/pull/31308#discussion_r563497900 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -561,16 +561,9 @@ case class TruncateTableCommand(

[GitHub] [spark] yaooqinn commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
yaooqinn commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563496408 ## File path: sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala ## @@ -37,31 +37,109 @@ trait CharVarcharTestSuite extends QueryTe

[GitHub] [spark] cloud-fan commented on pull request #30788: [SPARK-33726][SQL] Fix for Duplicate field names during Aggregation

2021-01-24 Thread GitBox
cloud-fan commented on pull request #30788: URL: https://github.com/apache/spark/pull/30788#issuecomment-766591390 thanks, merging to master/3.1/3.0! @yliou can you open a backport PR for 2.4 as it has code conflicts? thanks!

[GitHub] [spark] cloud-fan closed pull request #30788: [SPARK-33726][SQL] Fix for Duplicate field names during Aggregation

2021-01-24 Thread GitBox
cloud-fan closed pull request #30788: URL: https://github.com/apache/spark/pull/30788

[GitHub] [spark] SparkQA commented on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform optimization

2021-01-24 Thread GitBox
SparkQA commented on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766590311 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39006/

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563493450 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/HiveCharVarcharTestSuite.scala ## @@ -50,6 +51,35 @@ class HiveCharVarcharTestSuite extends

[GitHub] [spark] viirya edited a comment on pull request #31296: [SPARK-34205][SQL][SS] Add pipe to Dataset to enable Streaming Dataset pipe

2021-01-24 Thread GitBox
viirya edited a comment on pull request #31296: URL: https://github.com/apache/spark/pull/31296#issuecomment-766568619 Hmm, I'm fine if you think we should always require a custom function to produce the output.

[GitHub] [spark] yaooqinn commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
yaooqinn commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563493183 ## File path: sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala ## @@ -37,31 +37,109 @@ trait CharVarcharTestSuite extends QueryTe

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563492926 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/HiveCharVarcharTestSuite.scala ## @@ -50,6 +51,35 @@ class HiveCharVarcharTestSuite extends

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563492693 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/HiveCharVarcharTestSuite.scala ## @@ -50,6 +51,35 @@ class HiveCharVarcharTestSuite extends

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563491756 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -988,7 +989,15 @@ private[hive] class HiveClientImpl(

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563491086 ## File path: sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala ## @@ -37,31 +37,109 @@ trait CharVarcharTestSuite extends QueryT

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563490943 ## File path: sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala ## @@ -37,31 +37,109 @@ trait CharVarcharTestSuite extends QueryT

[GitHub] [spark] yaooqinn commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
yaooqinn commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563490566 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/PartitioningUtils.scala ## @@ -30,14 +35,33 @@ object PartitioningUtils { */

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563490346 ## File path: sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala ## @@ -37,31 +37,109 @@ trait CharVarcharTestSuite extends QueryT

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563490099 ## File path: sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala ## @@ -37,31 +37,109 @@ trait CharVarcharTestSuite extends QueryT

[GitHub] [spark] SparkQA commented on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
SparkQA commented on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766582896 **[Test build #134422 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134422/testReport)** for PR 31312 at commit [`4f65e4c`](https://github.com

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563489429 ## File path: sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala ## @@ -37,31 +37,109 @@ trait CharVarcharTestSuite extends QueryT

[GitHub] [spark] SparkQA removed a comment on pull request #30788: [SPARK-33726][SQL] Fix for Duplicate field names during Aggregation

2021-01-24 Thread GitBox
SparkQA removed a comment on pull request #30788: URL: https://github.com/apache/spark/pull/30788#issuecomment-766498102 **[Test build #134418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134418/testReport)** for PR 30788 at commit [`c6e036e`](https://gi

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563488635 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ## @@ -521,19 +521,19 @@ case class AlterTableRenamePartitionCom

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563488581 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ## @@ -465,13 +465,13 @@ case class AlterTableAddPartitionComman

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563488313 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/PartitioningUtils.scala ## @@ -30,14 +35,33 @@ object PartitioningUtils { */

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563488269 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/PartitioningUtils.scala ## @@ -30,14 +35,33 @@ object PartitioningUtils { */

[GitHub] [spark] SparkQA commented on pull request #30788: [SPARK-33726][SQL] Fix for Duplicate field names during Aggregation

2021-01-24 Thread GitBox
SparkQA commented on pull request #30788: URL: https://github.com/apache/spark/pull/30788#issuecomment-766580429 **[Test build #134418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134418/testReport)** for PR 30788 at commit [`c6e036e`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
SparkQA commented on pull request #31281: URL: https://github.com/apache/spark/pull/31281#issuecomment-766579761 **[Test build #134421 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134421/testReport)** for PR 31281 at commit [`6f19b5e`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31314: [SPARK-34221][WEBUI] Ensure if a stage fails in the UI page, the corresponding error message can be displayed correctly.

2021-01-24 Thread GitBox
AmplabJenkins removed a comment on pull request #31314: URL: https://github.com/apache/spark/pull/31314#issuecomment-766548136 Can one of the admins verify this patch?

[GitHub] [spark] sarutak commented on pull request #31314: [SPARK-34221][WEBUI] Ensure if a stage fails in the UI page, the corresponding error message can be displayed correctly.

2021-01-24 Thread GitBox
sarutak commented on pull request #31314: URL: https://github.com/apache/spark/pull/31314#issuecomment-766578766 Jenkins, add to whitelist.

[GitHub] [spark] MaxGekk commented on pull request #31308: [SPARK-34215][SQL] Keep tables cached after truncation

2021-01-24 Thread GitBox
MaxGekk commented on pull request #31308: URL: https://github.com/apache/spark/pull/31308#issuecomment-766575636 > BTW, do we have documentation for caching behaviour? @HyukjinKwon Probably not. At least I don't know of any such docs. > it would be great to have one

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
AmplabJenkins removed a comment on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766574516 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39005/

[GitHub] [spark] AmplabJenkins commented on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
AmplabJenkins commented on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766574516 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39005/

[GitHub] [spark] MaxGekk commented on a change in pull request #31308: [SPARK-34215][SQL] Keep tables cached after truncation

2021-01-24 Thread GitBox
MaxGekk commented on a change in pull request #31308: URL: https://github.com/apache/spark/pull/31308#discussion_r563484494 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -561,16 +561,9 @@ case class TruncateTableCommand(

[GitHub] [spark] yaooqinn commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
yaooqinn commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563483203 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/PartitioningUtils.scala ## @@ -30,14 +35,34 @@ object PartitioningUtils { */

[GitHub] [spark] viirya commented on pull request #31296: [SPARK-34205][SQL][SS] Add pipe to Dataset to enable Streaming Dataset pipe

2021-01-24 Thread GitBox
viirya commented on pull request #31296: URL: https://github.com/apache/spark/pull/31296#issuecomment-766568619 > OK so you seem to agree default serializer doesn't work for untyped. And I also think we agree default serializer is problematic for non-primitive type T on typed. These cases

[GitHub] [spark] cloud-fan commented on a change in pull request #31281: [SPARK-34192][SQL] Move char padding to write side and remove length check on read side too

2021-01-24 Thread GitBox
cloud-fan commented on a change in pull request #31281: URL: https://github.com/apache/spark/pull/31281#discussion_r563478821 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/PartitioningUtils.scala ## @@ -30,14 +35,34 @@ object PartitioningUtils { */

[GitHub] [spark] HeartSaVioR commented on pull request #31296: [SPARK-34205][SQL][SS] Add pipe to Dataset to enable Streaming Dataset pipe

2021-01-24 Thread GitBox
HeartSaVioR commented on pull request #31296: URL: https://github.com/apache/spark/pull/31296#issuecomment-766562418 OK so you seem to agree default serializer doesn't work for untyped. And I also think we agree default serializer is problematic for non-primitive type T on typed. These cas

[GitHub] [spark] zhengruifeng commented on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform optimization

2021-01-24 Thread GitBox
zhengruifeng commented on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766561560 friendly ping @srowen

[GitHub] [spark] SparkQA commented on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
SparkQA commented on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766556768 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39005/

[GitHub] [spark] viirya commented on pull request #31296: [SPARK-34205][SQL][SS] Add pipe to Dataset to enable Streaming Dataset pipe

2021-01-24 Thread GitBox
viirya commented on pull request #31296: URL: https://github.com/apache/spark/pull/31296#issuecomment-766554211 > The problem is other typed functions get the Row as simply `Row`, and able to call `Row.getString` or something like that, even knowing which columns the Row instance has. It d

[GitHub] [spark] SparkQA removed a comment on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform optimization

2021-01-24 Thread GitBox
SparkQA removed a comment on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766530270 **[Test build #134419 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134419/testReport)** for PR 31313 at commit [`919be7b`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform optimization

2021-01-24 Thread GitBox
AmplabJenkins removed a comment on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766549560 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134419/

[GitHub] [spark] AmplabJenkins commented on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform optimization

2021-01-24 Thread GitBox
AmplabJenkins commented on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766549560 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134419/

[GitHub] [spark] SparkQA commented on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform optimization

2021-01-24 Thread GitBox
SparkQA commented on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766549223 **[Test build #134419 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134419/testReport)** for PR 31313 at commit [`919be7b`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
AmplabJenkins removed a comment on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766547763 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134420/

[GitHub] [spark] AmplabJenkins commented on pull request #31314: [SPARK-34221][WEBUI] Ensure if a stage fails in the UI page, the corresponding error message can be displayed correctly.

2021-01-24 Thread GitBox
AmplabJenkins commented on pull request #31314: URL: https://github.com/apache/spark/pull/31314#issuecomment-766548136 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
AmplabJenkins commented on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766547763 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134420/

[GitHub] [spark] SparkQA commented on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform optimization

2021-01-24 Thread GitBox
SparkQA commented on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766547200 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39006/

[GitHub] [spark] SparkQA commented on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
SparkQA commented on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766543681 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39005/ -

[GitHub] [spark] zhengruifeng commented on pull request #31279: [SPARK-33518][ML][FOLLOWUP] MatrixFactorizationModel use GEMV

2021-01-24 Thread GitBox
zhengruifeng commented on pull request #31279: URL: https://github.com/apache/spark/pull/31279#issuecomment-766543071 ping @srowen This is an automated message from the Apache Git Service. To respond to the message, please l

[GitHub] [spark] wangyum closed pull request #31303: [SPARK-34211][SQL][TESTS] Benchmark TPC-DS with 1GB scale factor

2021-01-24 Thread GitBox
wangyum closed pull request #31303: URL: https://github.com/apache/spark/pull/31303 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] wangyum commented on pull request #31303: [SPARK-34211][SQL][TESTS] Benchmark TPC-DS with 1GB scale factor

2021-01-24 Thread GitBox
wangyum commented on pull request #31303: URL: https://github.com/apache/spark/pull/31303#issuecomment-766540647 Close it. Thank you all. This is an automated message from the Apache Git Service. To respond to the message, pl

[GitHub] [spark] zhengruifeng commented on a change in pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform opt

2021-01-24 Thread GitBox
zhengruifeng commented on a change in pull request #31313: URL: https://github.com/apache/spark/pull/31313#discussion_r563463265 ## File path: mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala ## @@ -60,14 +59,20 @@ private[ml] trait BucketedRa

[GitHub] [spark] cloud-fan closed pull request #31201: [SPARK-34133][AVRO] Respect case sensitivity when performing Catalyst-to-Avro field matching

2021-01-24 Thread GitBox
cloud-fan closed pull request #31201: URL: https://github.com/apache/spark/pull/31201 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [spark] cloud-fan commented on pull request #31201: [SPARK-34133][AVRO] Respect case sensitivity when performing Catalyst-to-Avro field matching

2021-01-24 Thread GitBox
cloud-fan commented on pull request #31201: URL: https://github.com/apache/spark/pull/31201#issuecomment-766539161 thanks, merging to master/3.1! This is an automated message from the Apache Git Service. To respond to the mes

[GitHub] [spark] akiyamaneko opened a new pull request #31314: [SPARK-34221][WEBUI]the error message of the stage in the UI page sometimes is blank

2021-01-24 Thread GitBox
akiyamaneko opened a new pull request #31314: URL: https://github.com/apache/spark/pull/31314 ### What changes were proposed in this pull request? Ensure that if a stage fails in the UI page, the corresponding error message can be displayed correctly. ### Why are the chang

[GitHub] [spark] cloud-fan commented on pull request #31303: [SPARK-34211][SQL][TESTS] Benchmark TPC-DS with 1GB scale factor

2021-01-24 Thread GitBox
cloud-fan commented on pull request #31303: URL: https://github.com/apache/spark/pull/31303#issuecomment-766537170 I agree that we can still track some big perf differences (like 10x slower) with Github actions, but running benchmarks in Github action takes a lot of resources and may not p

[GitHub] [spark] HeartSaVioR commented on pull request #31296: [SPARK-34205][SQL][SS] Add pipe to Dataset to enable Streaming Dataset pipe

2021-01-24 Thread GitBox
HeartSaVioR commented on pull request #31296: URL: https://github.com/apache/spark/pull/31296#issuecomment-766535285 > I'm not sure how much details you'd like to see? I'm quoting my comment: > That's why I want to hear the actual use case, what is the type of T Dataset, which

[GitHub] [spark] SparkQA removed a comment on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
SparkQA removed a comment on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766533133 **[Test build #134420 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134420/testReport)** for PR 31312 at commit [`7d93936`](https://gi

[GitHub] [spark] cloud-fan commented on pull request #31269: [SPARK-33933][SQL] Materialize BroadcastQueryStage first to try to avoid broadcast timeout in AQE

2021-01-24 Thread GitBox
cloud-fan commented on pull request #31269: URL: https://github.com/apache/spark/pull/31269#issuecomment-766533886 It's more of an improvement, so usually we don't backport. This is an automated message from the Apache Git Se

[GitHub] [spark] SparkQA commented on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
SparkQA commented on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766533645 **[Test build #134420 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134420/testReport)** for PR 31312 at commit [`7d93936`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
SparkQA commented on pull request #31312: URL: https://github.com/apache/spark/pull/31312#issuecomment-766533133 **[Test build #134420 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134420/testReport)** for PR 31312 at commit [`7d93936`](https://github.com

[GitHub] [spark] viirya edited a comment on pull request #31296: [SPARK-34205][SQL][SS] Add pipe to Dataset to enable Streaming Dataset pipe

2021-01-24 Thread GitBox
viirya edited a comment on pull request #31296: URL: https://github.com/apache/spark/pull/31296#issuecomment-766530579 > Is it too hard requirement to explain the actual use case, especially you've said you have internal customer claiming this feature? I don't think my request requires any

[GitHub] [spark] viirya commented on pull request #31296: [SPARK-34205][SQL][SS] Add pipe to Dataset to enable Streaming Dataset pipe

2021-01-24 Thread GitBox
viirya commented on pull request #31296: URL: https://github.com/apache/spark/pull/31296#issuecomment-766530579 > Is it too hard requirement to explain the actual use case, especially you've said you have internal customer claiming this feature? I don't think my request requires anything n

[GitHub] [spark] SparkQA commented on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform opt

2021-01-24 Thread GitBox
SparkQA commented on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766530270 **[Test build #134419 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134419/testReport)** for PR 31313 at commit [`919be7b`](https://github.com

[GitHub] [spark] zhengruifeng commented on pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform opt

2021-01-24 Thread GitBox
zhengruifeng commented on pull request #31313: URL: https://github.com/apache/spark/pull/31313#issuecomment-766530187 test code: ``` import org.apache.spark.ml.linalg._ import org.apache.spark.ml.feature._ import org.apache.spark.storage.StorageLevel val df = spark.re

[GitHub] [spark] zhengruifeng opened a new pull request #31313: [SPARK-34220][ML] BucketedRandomProjectionLSH transform opt

2021-01-24 Thread GitBox
zhengruifeng opened a new pull request #31313: URL: https://github.com/apache/spark/pull/31313 ### What changes were proposed in this pull request? use GEMV instead of DOT ### Why are the changes needed? 1, better performance, could be 20% faster than existing impl 2, simplif

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31311: [SPARK-34218][INFRA] Add Scala 2.13 packaging and publishing

2021-01-24 Thread GitBox
AmplabJenkins removed a comment on pull request #31311: URL: https://github.com/apache/spark/pull/31311#issuecomment-766529543 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #31311: [SPARK-34218][INFRA] Add Scala 2.13 packaging and publishing

2021-01-24 Thread GitBox
AmplabJenkins commented on pull request #31311: URL: https://github.com/apache/spark/pull/31311#issuecomment-766529543 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #31311: [SPARK-34218][INFRA] Add Scala 2.13 packaging and publishing

2021-01-24 Thread GitBox
SparkQA removed a comment on pull request #31311: URL: https://github.com/apache/spark/pull/31311#issuecomment-766484436 **[Test build #134417 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134417/testReport)** for PR 31311 at commit [`8e22c14`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31311: [SPARK-34218][INFRA] Add Scala 2.13 packaging and publishing

2021-01-24 Thread GitBox
SparkQA commented on pull request #31311: URL: https://github.com/apache/spark/pull/31311#issuecomment-766521633 **[Test build #134417 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134417/testReport)** for PR 31311 at commit [`8e22c14`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #31311: [SPARK-34218][INFRA] Add Scala 2.13 packaging and publishing

2021-01-24 Thread GitBox
SparkQA commented on pull request #31311: URL: https://github.com/apache/spark/pull/31311#issuecomment-766521451 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39003/ ---

[GitHub] [spark] sunchao commented on a change in pull request #31308: [SPARK-34215][SQL] Keep tables cached after truncation

2021-01-24 Thread GitBox
sunchao commented on a change in pull request #31308: URL: https://github.com/apache/spark/pull/31308#discussion_r563448718 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -561,16 +561,9 @@ case class TruncateTableCommand(

[GitHub] [spark] HeartSaVioR commented on pull request #31296: [SPARK-34205][SQL][SS] Add pipe to Dataset to enable Streaming Dataset pipe

2021-01-24 Thread GitBox
HeartSaVioR commented on pull request #31296: URL: https://github.com/apache/spark/pull/31296#issuecomment-766520145 Is it too hard requirement to explain the actual use case, especially you've said you have internal customer claiming this feature? I don't think my request requires anythin

[GitHub] [spark] beliefer opened a new pull request #31312: [SPARK-33542][SQL][FOLLOWUP] Group exception messages in catalyst/catalog

2021-01-24 Thread GitBox
beliefer opened a new pull request #31312: URL: https://github.com/apache/spark/pull/31312 ### What changes were proposed in this pull request? This PR follows up https://github.com/apache/spark/pull/30870. Maybe some contributors don't know the job and added some exception by the old
