[GitHub] [spark] yaooqinn commented on a change in pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
yaooqinn commented on a change in pull request #31921: URL: https://github.com/apache/spark/pull/31921#discussion_r598463339 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala ## @@ -130,13 +130,11 @@ class Par

[GitHub] [spark] HyukjinKwon commented on pull request #31922: [SPARK-34818][PYTHON][DOCS] Reorder the items in User Guide at PySpark documentation

2021-03-21 Thread GitBox
HyukjinKwon commented on pull request #31922: URL: https://github.com/apache/spark/pull/31922#issuecomment-803811160 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] HyukjinKwon closed pull request #31922: [SPARK-34818][PYTHON][DOCS] Reorder the items in User Guide at PySpark documentation

2021-03-21 Thread GitBox
HyukjinKwon closed pull request #31922: URL: https://github.com/apache/spark/pull/31922 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, pl

[GitHub] [spark] HyukjinKwon commented on pull request #31922: [SPARK-34818][PYTHON][DOCS] Reorder the items in User Guide at PySpark documentation

2021-03-21 Thread GitBox
HyukjinKwon commented on pull request #31922: URL: https://github.com/apache/spark/pull/31922#issuecomment-803810957 Thx! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. F

[GitHub] [spark] cloud-fan commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default

2021-03-21 Thread GitBox
cloud-fan commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803809778 late LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] SparkQA commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
SparkQA commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803809435 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40914/ -- This is an automated message from the A

[GitHub] [spark] cloud-fan commented on pull request #29642: [SPARK-32792][SQL] Improve InSet filter pushdown

2021-03-21 Thread GitBox
cloud-fan commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-803808606 Is this patch still needed? IIRC we already have this in hive partition pruning. -- This is an automated message from the Apache Git Service. To respond to the message, plea

[GitHub] [spark] SparkQA removed a comment on pull request #31922: [SPARK-34818][PYTHON][DOCS] Reorder the items in User Guide at PySpark documentation

2021-03-21 Thread GitBox
SparkQA removed a comment on pull request #31922: URL: https://github.com/apache/spark/pull/31922#issuecomment-803790212 **[Test build #136332 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136332/testReport)** for PR 31922 at commit [`a01b89e`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31922: [SPARK-34818][PYTHON][DOCS] Reorder the items in User Guide at PySpark documentation

2021-03-21 Thread GitBox
SparkQA commented on pull request #31922: URL: https://github.com/apache/spark/pull/31922#issuecomment-803807868 **[Test build #136332 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136332/testReport)** for PR 31922 at commit [`a01b89e`](https://github.co

[GitHub] [spark] Ngone51 commented on a change in pull request #31876: [WIP][SPARK-XXXX][API][CORE] Abstract Location in MapStatus to enable support for custom storage

2021-03-21 Thread GitBox
Ngone51 commented on a change in pull request #31876: URL: https://github.com/apache/spark/pull/31876#discussion_r598457153 ## File path: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ## @@ -28,16 +28,23 @@ import org.apache.spark.internal.config import org.ap

[GitHub] [spark] zhongyu09 commented on pull request #31167: [SPARK-33933][SQL] Materialize BroadcastQueryStage first to avoid broadcast timeout in AQE

2021-03-21 Thread GitBox
zhongyu09 commented on pull request #31167: URL: https://github.com/apache/spark/pull/31167#issuecomment-803803842 > This PR seems to be superseded by the author at #31269 > > ``` > [SPARK-33933][SQL] Materialize BroadcastQueryStage first to try to avoid broadcast timeout in AQE

[GitHub] [spark] cloud-fan commented on pull request #31842: [SPARK-34748][SS] Create a rule of the analysis logic for streaming write

2021-03-21 Thread GitBox
cloud-fan commented on pull request #31842: URL: https://github.com/apache/spark/pull/31842#issuecomment-803803661 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] cloud-fan closed pull request #31842: [SPARK-34748][SS] Create a rule of the analysis logic for streaming write

2021-03-21 Thread GitBox
cloud-fan closed pull request #31842: URL: https://github.com/apache/spark/pull/31842 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, plea

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
HyukjinKwon commented on a change in pull request #31921: URL: https://github.com/apache/spark/pull/31921#discussion_r598455740 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala ## @@ -130,13 +130,11 @@ class

[GitHub] [spark] SparkQA commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
SparkQA commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803802814 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40914/ -- This is an automated message from the Apache

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
HyukjinKwon commented on a change in pull request #31921: URL: https://github.com/apache/spark/pull/31921#discussion_r598455484 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala ## @@ -130,13 +130,11 @@ class

[GitHub] [spark] wangyum commented on a change in pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
wangyum commented on a change in pull request #31917: URL: https://github.com/apache/spark/pull/31917#discussion_r598455353 ## File path: sql/core/benchmarks/CSVBenchmark-results.txt ## @@ -2,66 +2,66 @@ Benchmark to measure CSV read/write performance ===

[GitHub] [spark] beliefer commented on pull request #31920: [SPARK-33604][SQL] Group exception messages in sql/execution

2021-03-21 Thread GitBox
beliefer commented on pull request #31920: URL: https://github.com/apache/spark/pull/31920#issuecomment-803802246 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] SparkQA commented on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
SparkQA commented on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803801410 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40913/ -- This is an automated message from the A

[GitHub] [spark] yaooqinn commented on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
yaooqinn commented on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803801061 cc @HyukjinKwon @cloud-fan @dongjoon-hyun thanks for reviewing -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] xuanyuanking commented on a change in pull request #31898: [SPARK-34790][CORE] Disable fetching shuffle blocks in batch when io encryption is enabled

2021-03-21 Thread GitBox
xuanyuanking commented on a change in pull request #31898: URL: https://github.com/apache/spark/pull/31898#discussion_r598454149 ## File path: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala ## @@ -51,15 +51,17 @@ private[spark] class BlockStoreShuff

[GitHub] [spark] cloud-fan commented on pull request #31898: [SPARK-34790][CORE] Disable fetching shuffle blocks in batch when io encryption is enabled

2021-03-21 Thread GitBox
cloud-fan commented on pull request #31898: URL: https://github.com/apache/spark/pull/31898#issuecomment-803800454 This reminds me of `CompressionCodec.supportsConcatenationOfSerializedStreams`. We have to disable the shuffle batch fetch if it can't work. This patch LGTM. Later on,

[GitHub] [spark] MaxGekk commented on a change in pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
MaxGekk commented on a change in pull request #31917: URL: https://github.com/apache/spark/pull/31917#discussion_r598452062 ## File path: sql/core/benchmarks/CSVBenchmark-results.txt ## @@ -2,66 +2,66 @@ Benchmark to measure CSV read/write performance ===

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31919: [SPARK-34087][FOLLOW-UP][SQL] Manage ExecutionListenerBus register inside itself

2021-03-21 Thread GitBox
dongjoon-hyun commented on a change in pull request #31919: URL: https://github.com/apache/spark/pull/31919#discussion_r598451669 ## File path: core/src/main/scala/org/apache/spark/ContextCleaner.scala ## @@ -172,18 +172,18 @@ private[spark] class ContextCleaner( registerF

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31919: [SPARK-34087][FOLLOW-UP][SQL] Manage ExecutionListenerBus register inside itself

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31919: URL: https://github.com/apache/spark/pull/31919#issuecomment-803796617 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136325/ -

[GitHub] [spark] AmplabJenkins commented on pull request #31919: [SPARK-34087][FOLLOW-UP][SQL] Manage ExecutionListenerBus register inside itself

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31919: URL: https://github.com/apache/spark/pull/31919#issuecomment-803796617 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136325/ -- This

[GitHub] [spark] dongjoon-hyun commented on pull request #31918: Revert "[SPARK-34757][CORE][DEPLOY] Ignore cache for SNAPSHOT dependencies in spark-submit"

2021-03-21 Thread GitBox
dongjoon-hyun commented on pull request #31918: URL: https://github.com/apache/spark/pull/31918#issuecomment-803796312 Thank you, @bozhang2820 and @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] SparkQA commented on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
SparkQA commented on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803795712 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40913/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #31919: [SPARK-34087][FOLLOW-UP][SQL] Manage ExecutionListenerBus register inside itself

2021-03-21 Thread GitBox
SparkQA removed a comment on pull request #31919: URL: https://github.com/apache/spark/pull/31919#issuecomment-803732611 **[Test build #136325 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136325/testReport)** for PR 31919 at commit [`53be3fa`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31919: [SPARK-34087][FOLLOW-UP][SQL] Manage ExecutionListenerBus register inside itself

2021-03-21 Thread GitBox
SparkQA commented on pull request #31919: URL: https://github.com/apache/spark/pull/31919#issuecomment-803795594 **[Test build #136325 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136325/testReport)** for PR 31919 at commit [`53be3fa`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
SparkQA commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803794304 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40912/ -- This is an automated message from the A

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803794337 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40912/

[GitHub] [spark] AmplabJenkins commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803794337 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40912/ -- T

[GitHub] [spark] wangyum commented on a change in pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
wangyum commented on a change in pull request #31917: URL: https://github.com/apache/spark/pull/31917#discussion_r598449310 ## File path: sql/core/benchmarks/CSVBenchmark-results.txt ## @@ -2,66 +2,66 @@ Benchmark to measure CSV read/write performance ===

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31918: Revert "[SPARK-34757][CORE][DEPLOY] Ignore cache for SNAPSHOT dependencies in spark-submit"

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31918: URL: https://github.com/apache/spark/pull/31918#issuecomment-803793642 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136321/ -

[GitHub] [spark] AmplabJenkins commented on pull request #31918: Revert "[SPARK-34757][CORE][DEPLOY] Ignore cache for SNAPSHOT dependencies in spark-submit"

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31918: URL: https://github.com/apache/spark/pull/31918#issuecomment-803793642 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136321/ -- This

[GitHub] [spark] HyukjinKwon edited a comment on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
HyukjinKwon edited a comment on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803792545 @MaxGekk, We should better have a way to do that, or at least document that we should do extra steps. All I read is: https://github.com/apache/spark/blob/d65f534c5

[GitHub] [spark] HyukjinKwon commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
HyukjinKwon commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803792545 @MaxGekk, We should better have a way to do that, or at least document that we should do that. All I read is: https://github.com/apache/spark/blob/d65f534c5ad4385b7c5198f

[GitHub] [spark] SparkQA removed a comment on pull request #31918: Revert "[SPARK-34757][CORE][DEPLOY] Ignore cache for SNAPSHOT dependencies in spark-submit"

2021-03-21 Thread GitBox
SparkQA removed a comment on pull request #31918: URL: https://github.com/apache/spark/pull/31918#issuecomment-803730608 **[Test build #136321 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136321/testReport)** for PR 31918 at commit [`19b7172`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31918: Revert "[SPARK-34757][CORE][DEPLOY] Ignore cache for SNAPSHOT dependencies in spark-submit"

2021-03-21 Thread GitBox
SparkQA commented on pull request #31918: URL: https://github.com/apache/spark/pull/31918#issuecomment-803792375 **[Test build #136321 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136321/testReport)** for PR 31918 at commit [`19b7172`](https://github.co

[GitHub] [spark] MaxGekk commented on a change in pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
MaxGekk commented on a change in pull request #31917: URL: https://github.com/apache/spark/pull/31917#discussion_r598447129 ## File path: sql/core/benchmarks/CSVBenchmark-results.txt ## @@ -2,66 +2,66 @@ Benchmark to measure CSV read/write performance ===

[GitHub] [spark] SparkQA commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
SparkQA commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803790976 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40912/ -- This is an automated message from the Apache

[GitHub] [spark] MaxGekk commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
MaxGekk commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803790641 @HyukjinKwon I care of reproducible benchmark results. Currently, you don't provide enough info to reproduce the same. I would prefer to follow scientific approach, and have a c

[GitHub] [spark] SparkQA commented on pull request #31919: [SPARK-34087][FOLLOW-UP][SQL] Manage ExecutionListenerBus register inside itself

2021-03-21 Thread GitBox
SparkQA commented on pull request #31919: URL: https://github.com/apache/spark/pull/31919#issuecomment-803790245 **[Test build #136333 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136333/testReport)** for PR 31919 at commit [`dfd7d38`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #31922: [SPARK-34818][PYTHON][DOCS] Reorder the items in User Guide at PySpark documentation

2021-03-21 Thread GitBox
SparkQA commented on pull request #31922: URL: https://github.com/apache/spark/pull/31922#issuecomment-803790212 **[Test build #136332 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136332/testReport)** for PR 31922 at commit [`a01b89e`](https://github.com

[GitHub] [spark] HyukjinKwon commented on pull request #27356: [SPARK-29924][DOCS] Document Apache Arrow JDK11 requirement

2021-03-21 Thread GitBox
HyukjinKwon commented on pull request #27356: URL: https://github.com/apache/spark/pull/27356#issuecomment-803789413 Sure, thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [spark] HyukjinKwon closed pull request #31918: Revert "[SPARK-34757][CORE][DEPLOY] Ignore cache for SNAPSHOT dependencies in spark-submit"

2021-03-21 Thread GitBox
HyukjinKwon closed pull request #31918: URL: https://github.com/apache/spark/pull/31918 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, pl

[GitHub] [spark] romainx commented on pull request #27356: [SPARK-29924][DOCS] Document Apache Arrow JDK11 requirement

2021-03-21 Thread GitBox
romainx commented on pull request #27356: URL: https://github.com/apache/spark/pull/27356#issuecomment-803788747 @HyukjinKwon sure I will have a look and I will try to draft a PR, it could take some time since it's my first contribution here. Thanks for the proposal. -- This is an automa

[GitHub] [spark] HyukjinKwon commented on pull request #31918: Revert "[SPARK-34757][CORE][DEPLOY] Ignore cache for SNAPSHOT dependencies in spark-submit"

2021-03-21 Thread GitBox
HyukjinKwon commented on pull request #31918: URL: https://github.com/apache/spark/pull/31918#issuecomment-803788733 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31847: [SPARK-34755][SQL] Support the utils for transform number format

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31847: URL: https://github.com/apache/spark/pull/31847#issuecomment-803787636 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40910/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31920: [SPARK-33604][SQL] Group exception messages in sql/execution

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31920: URL: https://github.com/apache/spark/pull/31920#issuecomment-803787637 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40911/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803787635 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40909/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31517: [WIP][SPARK-34309][BUILD][CORE][SQL] [K8S]Use Caffeine instead of Guava Cache

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31517: URL: https://github.com/apache/spark/pull/31517#issuecomment-803787983 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136324/ -

[GitHub] [spark] AmplabJenkins commented on pull request #31517: [WIP][SPARK-34309][BUILD][CORE][SQL] [K8S]Use Caffeine instead of Guava Cache

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31517: URL: https://github.com/apache/spark/pull/31517#issuecomment-803787983 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136324/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #31920: [SPARK-33604][SQL] Group exception messages in sql/execution

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31920: URL: https://github.com/apache/spark/pull/31920#issuecomment-803787637 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40911/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803787635 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40909/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #31847: [SPARK-34755][SQL] Support the utils for transform number format

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31847: URL: https://github.com/apache/spark/pull/31847#issuecomment-803787636 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40910/ -- T

[GitHub] [spark] HyukjinKwon commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
HyukjinKwon commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803787041 @MaxGekk If we care about that, it would be great if we include that in benchmark results. -- This is an automated message from the Apache Git Service. To respond to the m

[GitHub] [spark] SparkQA commented on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
SparkQA commented on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803786190 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40909/ -- This is an automated message from the A

[GitHub] [spark] MaxGekk commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
MaxGekk commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803785118 @HyukjinKwon The purpose is to give others enough info about the environment to get the same benchmark results. Do you really think that: ``` Java HotSpot(TM) 64-Bit Server

[GitHub] [spark] cloud-fan commented on a change in pull request #31898: [SPARK-34790][CORE] Disable fetching shuffle blocks in batch when io encryption is enabled

2021-03-21 Thread GitBox
cloud-fan commented on a change in pull request #31898: URL: https://github.com/apache/spark/pull/31898#discussion_r598442650 ## File path: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala ## @@ -51,15 +51,17 @@ private[spark] class BlockStoreShuffleR

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
HyukjinKwon commented on a change in pull request #31917: URL: https://github.com/apache/spark/pull/31917#discussion_r598442365 ## File path: sql/core/benchmarks/CSVBenchmark-results.txt ## @@ -2,66 +2,66 @@ Benchmark to measure CSV read/write performance ===

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
HyukjinKwon commented on a change in pull request #31917: URL: https://github.com/apache/spark/pull/31917#discussion_r598440649 ## File path: sql/core/benchmarks/CSVBenchmark-results.txt ## @@ -2,66 +2,66 @@ Benchmark to measure CSV read/write performance ===

[GitHub] [spark] HyukjinKwon commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
HyukjinKwon commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803782570 I think the benchmark results include that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] MaxGekk commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
MaxGekk commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803780677 @HyukjinKwon Could you update PR's description and point out the environment in which you run the benchmark, please. -- This is an automated message from the Apache Git Servic

[GitHub] [spark] SparkQA commented on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
SparkQA commented on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803780201 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40909/ -- This is an automated message from the Apache

[GitHub] [spark] MaxGekk commented on a change in pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
MaxGekk commented on a change in pull request #31917: URL: https://github.com/apache/spark/pull/31917#discussion_r598438961 ## File path: sql/core/benchmarks/CSVBenchmark-results.txt ## @@ -2,66 +2,66 @@ Benchmark to measure CSV read/write performance ===

[GitHub] [spark] SparkQA removed a comment on pull request #31517: [WIP][SPARK-34309][BUILD][CORE][SQL] [K8S]Use Caffeine instead of Guava Cache

2021-03-21 Thread GitBox
SparkQA removed a comment on pull request #31517: URL: https://github.com/apache/spark/pull/31517#issuecomment-803730936 **[Test build #136324 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136324/testReport)** for PR 31517 at commit [`adc6d92`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31517: [WIP][SPARK-34309][BUILD][CORE][SQL] [K8S]Use Caffeine instead of Guava Cache

2021-03-21 Thread GitBox
SparkQA commented on pull request #31517: URL: https://github.com/apache/spark/pull/31517#issuecomment-803778202 **[Test build #136324 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136324/testReport)** for PR 31517 at commit [`adc6d92`](https://github.co

[GitHub] [spark] HyukjinKwon commented on pull request #31922: [SPARK-34818][PYTHON][DOCS] Reorder the items in User Guide at PySpark documentation

2021-03-21 Thread GitBox
HyukjinKwon commented on pull request #31922: URL: https://github.com/apache/spark/pull/31922#issuecomment-803777036 cc @zero323, @srowen, @viirya can you take a quick look please? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

[GitHub] [spark] HyukjinKwon opened a new pull request #31922: [SPARK-34818][PYTHON][DOCS] Reorder the items in User Guide at PySpark documentation

2021-03-21 Thread GitBox
HyukjinKwon opened a new pull request #31922: URL: https://github.com/apache/spark/pull/31922 ### What changes were proposed in this pull request? This PR proposes to reorder the items in User Guide in PySpark documentation in order to place general guides first and advance ones late

[GitHub] [spark] SparkQA commented on pull request #31920: [SPARK-33604][SQL] Group exception messages in sql/execution

2021-03-21 Thread GitBox
SparkQA commented on pull request #31920: URL: https://github.com/apache/spark/pull/31920#issuecomment-803776259 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40911/ -- This

[GitHub] [spark] SparkQA commented on pull request #31886: [WIP][SPARK-34795][SQL][TEST] Adds a new job in GitHub Actions to check the output of TPC-DS queries

2021-03-21 Thread GitBox
SparkQA commented on pull request #31886: URL: https://github.com/apache/spark/pull/31886#issuecomment-803773652 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40905/ -- This

[GitHub] [spark] SparkQA commented on pull request #31847: [SPARK-34755][SQL] Support the utils for transform number format

2021-03-21 Thread GitBox
SparkQA commented on pull request #31847: URL: https://github.com/apache/spark/pull/31847#issuecomment-803772614 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40910/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #31847: [SPARK-34755][SQL] Support the utils for transform number format

2021-03-21 Thread GitBox
SparkQA commented on pull request #31847: URL: https://github.com/apache/spark/pull/31847#issuecomment-803768105 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40910/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
SparkQA commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803766091 **[Test build #136331 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136331/testReport)** for PR 31917 at commit [`3575e48`](https://github.com

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
HyukjinKwon commented on a change in pull request #31917: URL: https://github.com/apache/spark/pull/31917#discussion_r598428153 ## File path: sql/core/benchmarks/CSVBenchmark-results.txt ## @@ -2,66 +2,66 @@ Benchmark to measure CSV read/write performance ===

[GitHub] [spark] sarutak commented on pull request #31718: [SPARK-34225][CORE] Don't encode further when a URI form string is passed to addFile or addJar

2021-03-21 Thread GitBox
sarutak commented on pull request #31718: URL: https://github.com/apache/spark/pull/31718#issuecomment-803764301 Merged to `master` and `branch-3.1`. Thank you @srowen and @HyukjinKwon . -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] asfgit closed pull request #31718: [SPARK-34225][CORE] Don't encode further when a URI form string is passed to addFile or addJar

2021-03-21 Thread GitBox
asfgit closed pull request #31718: URL: https://github.com/apache/spark/pull/31718 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [spark] SparkQA commented on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
SparkQA commented on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803763315 **[Test build #136330 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136330/testReport)** for PR 31921 at commit [`8ff3267`](https://github.com

[GitHub] [spark] MaxGekk commented on a change in pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
MaxGekk commented on a change in pull request #31917: URL: https://github.com/apache/spark/pull/31917#discussion_r598427096 ## File path: sql/core/benchmarks/CSVBenchmark-results.txt ## @@ -2,66 +2,66 @@ Benchmark to measure CSV read/write performance ===

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31918: Revert "[SPARK-34757][CORE][DEPLOY] Ignore cache for SNAPSHOT dependencies in spark-submit"

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31918: URL: https://github.com/apache/spark/pull/31918#issuecomment-803762578 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40903/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31919: [SPARK-34087][FOLLOW-UP][SQL] Manage ExecutionListenerBus register inside itself

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31919: URL: https://github.com/apache/spark/pull/31919#issuecomment-803762577 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40907/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803762580 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136327/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
AmplabJenkins removed a comment on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803762552 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] AmplabJenkins commented on pull request #31918: Revert "[SPARK-34757][CORE][DEPLOY] Ignore cache for SNAPSHOT dependencies in spark-submit"

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31918: URL: https://github.com/apache/spark/pull/31918#issuecomment-803762578 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40903/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803762552 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For q

[GitHub] [spark] AmplabJenkins commented on pull request #31919: [SPARK-34087][FOLLOW-UP][SQL] Manage ExecutionListenerBus register inside itself

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31919: URL: https://github.com/apache/spark/pull/31919#issuecomment-803762577 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40907/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
AmplabJenkins commented on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803762580 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136327/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
SparkQA removed a comment on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803730648 **[Test build #136322 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136322/testReport)** for PR 31917 at commit [`750f92b`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31917: [SPARK-34815][SQL] Update CSVBenchmark

2021-03-21 Thread GitBox
SparkQA commented on pull request #31917: URL: https://github.com/apache/spark/pull/31917#issuecomment-803761914 **[Test build #136322 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136322/testReport)** for PR 31917 at commit [`750f92b`](https://github.co

[GitHub] [spark] MaxGekk closed pull request #30678: [MINOR][SQL] Spelling: filters - PushedFilers

2021-03-21 Thread GitBox
MaxGekk closed pull request #30678: URL: https://github.com/apache/spark/pull/30678 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [spark] MaxGekk commented on pull request #30678: [MINOR][SQL] Spelling: filters - PushedFilers

2021-03-21 Thread GitBox
MaxGekk commented on pull request #30678: URL: https://github.com/apache/spark/pull/30678#issuecomment-803760318 +1, LGTM, I am merging this to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] SparkQA commented on pull request #31919: [SPARK-34087][FOLLOW-UP][SQL] Manage ExecutionListenerBus register inside itself

2021-03-21 Thread GitBox
SparkQA commented on pull request #31919: URL: https://github.com/apache/spark/pull/31919#issuecomment-803756870 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40907/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #31918: Revert "[SPARK-34757][CORE][DEPLOY] Ignore cache for SNAPSHOT dependencies in spark-submit"

2021-03-21 Thread GitBox
SparkQA commented on pull request #31918: URL: https://github.com/apache/spark/pull/31918#issuecomment-803753903 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40903/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #31919: [SPARK-34087][FOLLOW-UP][SQL] Manage ExecutionListenerBus register inside itself

2021-03-21 Thread GitBox
SparkQA commented on pull request #31919: URL: https://github.com/apache/spark/pull/31919#issuecomment-803753661 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40907/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #31921: [SPARK-34817][SQL] Read parquet unsigned types that stored as int32 physical type in parquet

2021-03-21 Thread GitBox
SparkQA removed a comment on pull request #31921: URL: https://github.com/apache/spark/pull/31921#issuecomment-803745469 **[Test build #136327 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136327/testReport)** for PR 31921 at commit [`0c8b6d4`](https://gi
