[GitHub] [spark] zero323 commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

2020-10-25 Thread GitBox
zero323 commented on pull request #30143: URL: https://github.com/apache/spark/pull/30143#issuecomment-716107740 > > Add proper NumPy-style docstrings to expanded functions. > > Oh, let's don't do this in this PR. OK. Shall we extend existing docstrings here at all? And

[GitHub] [spark] SparkQA commented on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-10-25 Thread GitBox
SparkQA commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-716107758 **[Test build #130241 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130241/testReport)** for PR 30139 at commit [`0654684`](https://github.com

[GitHub] [spark] AngersZhuuuu commented on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-10-25 Thread GitBox
AngersZh commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-716108096 > Could you add an empty commit whose authorship is the original author, @AngersZh . If then, Apache Spark merge script can give both of you the authorship properly.

[GitHub] [spark] SparkQA commented on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-10-25 Thread GitBox
SparkQA commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-716111959 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34841/

[GitHub] [spark] AmplabJenkins commented on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-716114251 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-10-25 Thread GitBox
SparkQA commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-716114246 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34841/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-716114251

[GitHub] [spark] AngersZhuuuu commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2020-10-25 Thread GitBox
AngersZh commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-716122510 retest this please

[GitHub] [spark] AngersZhuuuu commented on pull request #30144: [SPARK-33229][SQL]Support GROUP BY use Separate columns and CUBE/ROLLUP

2020-10-25 Thread GitBox
AngersZh commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-716122535 retest this please

[GitHub] [spark] SparkQA commented on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-10-25 Thread GitBox
SparkQA commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-716122596 **[Test build #130241 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130241/testReport)** for PR 30139 at commit [`0654684`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-10-25 Thread GitBox
SparkQA removed a comment on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-716107758 **[Test build #130241 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130241/testReport)** for PR 30139 at commit [`0654684`](https://gi

[GitHub] [spark] AngersZhuuuu commented on pull request #29421: [SPARK-32388][SQL][test-hadoop2.7][test-hive1.2] TRANSFORM with schema-less mode should keep the same with hive

2020-10-25 Thread GitBox
AngersZh commented on pull request #29421: URL: https://github.com/apache/spark/pull/29421#issuecomment-716122843 ping @cloud-fan @dongjoon-hyun

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL]Support GROUP BY use Separate columns and CUBE/ROLLUP

2020-10-25 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-716122765 **[Test build #130243 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130243/testReport)** for PR 30144 at commit [`68c3e48`](https://github.com

[GitHub] [spark] AngersZhuuuu removed a comment on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-10-25 Thread GitBox
AngersZh removed a comment on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-704654464 ping @cloud-fan

[GitHub] [spark] AngersZhuuuu commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-10-25 Thread GitBox
AngersZh commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-716122778 ping @cloud-fan @dongjoon-hyun

[GitHub] [spark] SparkQA commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2020-10-25 Thread GitBox
SparkQA commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-716122750 **[Test build #130242 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130242/testReport)** for PR 30145 at commit [`fa3346c`](https://github.com

[GitHub] [spark] AngersZhuuuu removed a comment on pull request #29421: [SPARK-32388][SQL][test-hadoop2.7][test-hive1.2] TRANSFORM with schema-less mode should keep the same with hive

2020-10-25 Thread GitBox
AngersZh removed a comment on pull request #29421: URL: https://github.com/apache/spark/pull/29421#issuecomment-696595816 ping @cloud-fan

[GitHub] [spark] AmplabJenkins commented on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-716122821

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-716122821

[GitHub] [spark] SparkQA commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2020-10-25 Thread GitBox
SparkQA commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-716127238 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34842/

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL]Support GROUP BY use Separate columns and CUBE/ROLLUP

2020-10-25 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-716128184 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34843/

[GitHub] [spark] AmplabJenkins commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-716130179

[GitHub] [spark] SparkQA commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2020-10-25 Thread GitBox
SparkQA commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-716130173 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34842/

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL]Support GROUP BY use Separate columns and CUBE/ROLLUP

2020-10-25 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-716131521 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34843/

[GitHub] [spark] AmplabJenkins commented on pull request #30144: [SPARK-33229][SQL]Support GROUP BY use Separate columns and CUBE/ROLLUP

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-716131532

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-716130179

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30144: [SPARK-33229][SQL]Support GROUP BY use Separate columns and CUBE/ROLLUP

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-716131532

[GitHub] [spark] srowen commented on pull request #23066: [SPARK-26043][CORE] Make SparkHadoopUtil private to Spark

2020-10-25 Thread GitBox
srowen commented on pull request #23066: URL: https://github.com/apache/spark/pull/23066#issuecomment-716145969 The idea is that it was never meant for users (developer API) and wasn't easy to make stable. You can simply replicate its code in your code base; it's just a bit of helper code.

[GitHub] [spark] SparkQA commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
SparkQA commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716153233 **[Test build #130244 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130244/testReport)** for PR 29642 at commit [`c5ab656`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
SparkQA commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716155498 **[Test build #130245 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130245/testReport)** for PR 29642 at commit [`0169114`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL]Support GROUP BY use Separate columns and CUBE/ROLLUP

2020-10-25 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-716157364 **[Test build #130243 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130243/testReport)** for PR 30144 at commit [`68c3e48`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2020-10-25 Thread GitBox
SparkQA commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-716157394 **[Test build #130242 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130242/testReport)** for PR 30145 at commit [`fa3346c`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #30144: [SPARK-33229][SQL]Support GROUP BY use Separate columns and CUBE/ROLLUP

2020-10-25 Thread GitBox
SparkQA removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-716122765 **[Test build #130243 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130243/testReport)** for PR 30144 at commit [`68c3e48`](https://gi

[GitHub] [spark] SparkQA removed a comment on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2020-10-25 Thread GitBox
SparkQA removed a comment on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-716122750 **[Test build #130242 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130242/testReport)** for PR 30145 at commit [`fa3346c`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #30144: [SPARK-33229][SQL]Support GROUP BY use Separate columns and CUBE/ROLLUP

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-716157686

[GitHub] [spark] AmplabJenkins commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-716157734

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-716157734

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30144: [SPARK-33229][SQL]Support GROUP BY use Separate columns and CUBE/ROLLUP

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-716157686

[GitHub] [spark] SparkQA commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
SparkQA commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716159240 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34844/

[GitHub] [spark] wangyum commented on a change in pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
wangyum commented on a change in pull request #29642: URL: https://github.com/apache/spark/pull/29642#discussion_r511606498 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala ## @@ -597,12 +599,26 @@ class ParquetFilte

[GitHub] [spark] SparkQA commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
SparkQA commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716162644

[GitHub] [spark] AmplabJenkins commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716162655

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716162655

[GitHub] [spark] AmplabJenkins commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716166198

[GitHub] [spark] SparkQA commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
SparkQA commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716166190 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34845/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716166198

[GitHub] [spark] boy-uber commented on pull request #30004: [SPARK-33114][CORE] Add metadata in MapStatus to support custom shuffle manager

2020-10-25 Thread GitBox
boy-uber commented on pull request #30004: URL: https://github.com/apache/spark/pull/30004#issuecomment-716185714 > @boy-uber I feel that #28618 and previously related works seems doing what you want here. Yes, mccheah was working on #28618 to extend map status metadata, and he stop

[GitHub] [spark] SparkQA commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
SparkQA commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716188129 **[Test build #130244 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130244/testReport)** for PR 29642 at commit [`c5ab656`](https://github.co

[GitHub] [spark] mridulm commented on pull request #30004: [SPARK-33114][CORE] Add metadata in MapStatus to support custom shuffle manager

2020-10-25 Thread GitBox
mridulm commented on pull request #30004: URL: https://github.com/apache/spark/pull/30004#issuecomment-716188127 I was not aware that #28618 was abandoned. +CC @mccheah

[GitHub] [spark] AmplabJenkins commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716188402

[GitHub] [spark] SparkQA commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
SparkQA commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716190940 **[Test build #130245 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130245/testReport)** for PR 29642 at commit [`0169114`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716191230

[GitHub] [spark] SparkQA removed a comment on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
SparkQA removed a comment on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716153233

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716188402

[GitHub] [spark] c21 commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
c21 commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716207490 @viirya - gentle ping again, any more comments before merging? Thanks.

[GitHub] [spark] HeartSaVioR commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
HeartSaVioR commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716227637 retest this, please

[GitHub] [spark] SparkQA commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
SparkQA commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716228578 **[Test build #130246 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130246/testReport)** for PR 30076 at commit [`14871d9`](https://github.com

[GitHub] [spark] dongjoon-hyun closed pull request #30140: [SPARK-33228][SQL] Don't uncache data when replacing a view having the same logical plan

2020-10-25 Thread GitBox
dongjoon-hyun closed pull request #30140: URL: https://github.com/apache/spark/pull/30140

[GitHub] [spark] SparkQA commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
SparkQA commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716234041 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34846/

[GitHub] [spark] dongjoon-hyun closed pull request #30123: [SPARK-33234][INFRA] Generates SHA-512 using shasum

2020-10-25 Thread GitBox
dongjoon-hyun closed pull request #30123: URL: https://github.com/apache/spark/pull/30123

[GitHub] [spark] dongjoon-hyun commented on pull request #30123: [SPARK-33234][INFRA] Generates SHA-512 using shasum

2020-10-25 Thread GitBox
dongjoon-hyun commented on pull request #30123: URL: https://github.com/apache/spark/pull/30123#issuecomment-716236333 What is your JIRA id, @emilianbold ?

[GitHub] [spark] emilianbold commented on pull request #30123: [SPARK-33234][INFRA] Generates SHA-512 using shasum

2020-10-25 Thread GitBox
emilianbold commented on pull request #30123: URL: https://github.com/apache/spark/pull/30123#issuecomment-716237452 Not sure what you mean, my Apache JIRA ID or what JIRA ID? Also, I wonder, what for? --emi On Mon, Oct 26, 2020 at 2:08 AM Dongjoon Hyun wrote:

[GitHub] [spark] SparkQA commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
SparkQA commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716238344 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34846/

[GitHub] [spark] AmplabJenkins commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716238353

[GitHub] [spark] SparkQA commented on pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

2020-10-25 Thread GitBox
SparkQA commented on pull request #30062: URL: https://github.com/apache/spark/pull/30062#issuecomment-716238477 **[Test build #130247 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130247/testReport)** for PR 30062 at commit [`4b6847f`](https://github.com

[GitHub] [spark] c21 commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
c21 commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716240797 retest this please

[GitHub] [spark] SparkQA commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
SparkQA commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716242211 **[Test build #130248 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130248/testReport)** for PR 30076 at commit [`14871d9`](https://github.com

[GitHub] [spark] otterc commented on a change in pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

2020-10-25 Thread GitBox
otterc commented on a change in pull request #30062: URL: https://github.com/apache/spark/pull/30062#discussion_r511673022 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -0,0 +1,899 @@ +/* + * Licensed to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716238353

[GitHub] [spark] otterc commented on a change in pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

2020-10-25 Thread GitBox
otterc commented on a change in pull request #30062: URL: https://github.com/apache/spark/pull/30062#discussion_r511673540 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -0,0 +1,899 @@ +/* + * Licensed to

[GitHub] [spark] SparkQA commented on pull request #30130: [WIP][SPARK-32354][K8S][TESTS] Re-enable RTestsSuite

2020-10-25 Thread GitBox
SparkQA commented on pull request #30130: URL: https://github.com/apache/spark/pull/30130#issuecomment-716245270 **[Test build #130249 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130249/testReport)** for PR 30130 at commit [`01666ef`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #30066: [SPARK-XXX][INFRA] Use pre-built image at GitHub Action SparkR job

2020-10-25 Thread GitBox
SparkQA commented on pull request #30066: URL: https://github.com/apache/spark/pull/30066#issuecomment-716245284 **[Test build #130250 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130250/testReport)** for PR 30066 at commit [`1557556`](https://github.com

[GitHub] [spark] otterc commented on a change in pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

2020-10-25 Thread GitBox
otterc commented on a change in pull request #30062: URL: https://github.com/apache/spark/pull/30062#discussion_r511674349 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -0,0 +1,905 @@ +/* + * Licensed to

[GitHub] [spark] HyukjinKwon commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

2020-10-25 Thread GitBox
HyukjinKwon commented on pull request #30143: URL: https://github.com/apache/spark/pull/30143#issuecomment-716245920 Yeah, I think we can just keep docstrings as-is. For grouping stuff, I think it's okay to don't change since it's mainly for code readers. There were similar changes propose

[GitHub] [spark] HyukjinKwon commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

2020-10-25 Thread GitBox
HyukjinKwon commented on pull request #30143: URL: https://github.com/apache/spark/pull/30143#issuecomment-716246037 The current changes look pretty good to go. Let me know when you think it's ready. I'll take one more look and merge it in.

[GitHub] [spark] wangyum commented on pull request #29642: [SPARK-32792][SQL] Improve in filter pushdown for ParquetFilters

2020-10-25 Thread GitBox
wangyum commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-716246936 cc @cloud-fan @HyukjinKwon @gengliangwang

[GitHub] [spark] github-actions[bot] commented on pull request #28994: [SPARK-32170][CORE] Improve the speculation for the inefficient tasks by the task metrics.

2020-10-25 Thread GitBox
github-actions[bot] commented on pull request #28994: URL: https://github.com/apache/spark/pull/28994#issuecomment-716247017 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue ma

[GitHub] [spark] otterc commented on a change in pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

2020-10-25 Thread GitBox
otterc commented on a change in pull request #30062: URL: https://github.com/apache/spark/pull/30062#discussion_r511675412 ## File path: common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java ## @@ -172,7 +178,9 @@ protected void serviceInit(Co

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

2020-10-25 Thread GitBox
HyukjinKwon commented on a change in pull request #30143: URL: https://github.com/apache/spark/pull/30143#discussion_r511675622 ## File path: python/pyspark/sql/functions.py ## @@ -42,154 +42,455 @@ # since it requires to make every single overridden definition. -def _crea

[GitHub] [spark] SparkQA commented on pull request #30130: [WIP][SPARK-32354][K8S][TESTS] Re-enable RTestsSuite

2020-10-25 Thread GitBox
SparkQA commented on pull request #30130: URL: https://github.com/apache/spark/pull/30130#issuecomment-716248862 **[Test build #130249 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130249/testReport)** for PR 30130 at commit [`01666ef`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30130: [WIP][SPARK-32354][K8S][TESTS] Re-enable RTestsSuite

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #30130: URL: https://github.com/apache/spark/pull/30130#issuecomment-716248961

[GitHub] [spark] SparkQA removed a comment on pull request #30130: [WIP][SPARK-32354][K8S][TESTS] Re-enable RTestsSuite

2020-10-25 Thread GitBox
SparkQA removed a comment on pull request #30130: URL: https://github.com/apache/spark/pull/30130#issuecomment-716245270 **[Test build #130249 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130249/testReport)** for PR 30130 at commit [`01666ef`](https://gi

[GitHub] [spark] SparkQA commented on pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

2020-10-25 Thread GitBox
SparkQA commented on pull request #30062: URL: https://github.com/apache/spark/pull/30062#issuecomment-716249075 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34847/

[GitHub] [spark] AmplabJenkins commented on pull request #30130: [WIP][SPARK-32354][K8S][TESTS] Re-enable RTestsSuite

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #30130: URL: https://github.com/apache/spark/pull/30130#issuecomment-716248961

[GitHub] [spark] SparkQA commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
SparkQA commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716251124 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34848/

[GitHub] [spark] HyukjinKwon commented on pull request #29818: [SPARK-32953][PYTHON] Add Arrow self_destruct support to toPandas

2020-10-25 Thread GitBox
HyukjinKwon commented on pull request #29818: URL: https://github.com/apache/spark/pull/29818#issuecomment-716252237 @lidavidm, just out of curiosity, we can do it in pandas UDFs too, right? I can of course do it separately in another PR but I was wondering if I am understanding correctly.

[GitHub] [spark] zhengruifeng commented on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-25 Thread GitBox
zhengruifeng commented on pull request #30009: URL: https://github.com/apache/spark/pull/30009#issuecomment-716252855 retest this please

[GitHub] [spark] SparkQA commented on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-25 Thread GitBox
SparkQA commented on pull request #30009: URL: https://github.com/apache/spark/pull/30009#issuecomment-716253408 **[Test build #130251 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130251/testReport)** for PR 30009 at commit [`fc1bc87`](https://github.com

[GitHub] [spark] lidavidm commented on pull request #29818: [SPARK-32953][PYTHON] Add Arrow self_destruct support to toPandas

2020-10-25 Thread GitBox
lidavidm commented on pull request #29818: URL: https://github.com/apache/spark/pull/29818#issuecomment-716253285 @HyukjinKwon hmm, it should also work there, but I can confirm (I haven't looked at that codepath/can measure the memory usage there too). It will only work one way though (Spa

[GitHub] [spark] HyukjinKwon edited a comment on pull request #29818: [SPARK-32953][PYTHON] Add Arrow self_destruct support to toPandas

2020-10-25 Thread GitBox
HyukjinKwon edited a comment on pull request #29818: URL: https://github.com/apache/spark/pull/29818#issuecomment-716252237 @lidavidm, just out of curiosity, we can do it in pandas UDFs too, right? It can of course be done separately in another PR but I was wondering if I am understanding

[GitHub] [spark] SparkQA commented on pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

2020-10-25 Thread GitBox
SparkQA commented on pull request #30062: URL: https://github.com/apache/spark/pull/30062#issuecomment-716254020 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34847/

[GitHub] [spark] AmplabJenkins commented on pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #30062: URL: https://github.com/apache/spark/pull/30062#issuecomment-716254033

[GitHub] [spark] SparkQA commented on pull request #30066: [SPARK-XXX][INFRA] Use pre-built image at GitHub Action SparkR job

2020-10-25 Thread GitBox
SparkQA commented on pull request #30066: URL: https://github.com/apache/spark/pull/30066#issuecomment-716254525 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34850/

[GitHub] [spark] HyukjinKwon commented on pull request #29818: [SPARK-32953][PYTHON] Add Arrow self_destruct support to toPandas

2020-10-25 Thread GitBox
HyukjinKwon commented on pull request #29818: URL: https://github.com/apache/spark/pull/29818#issuecomment-716254744 Ah, that's fine. That one, we can just do it separately. We might have to use self-destruct at `ArrowStreamPandasSerializer.arrow_to_pandas` to make it work but sure ther

[GitHub] [spark] SparkQA commented on pull request #30130: [WIP][SPARK-32354][K8S][TESTS] Re-enable RTestsSuite

2020-10-25 Thread GitBox
SparkQA commented on pull request #30130: URL: https://github.com/apache/spark/pull/30130#issuecomment-716256012 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34849/

[GitHub] [spark] HyukjinKwon commented on pull request #30123: [SPARK-33234][INFRA] Generates SHA-512 using shasum

2020-10-25 Thread GitBox
HyukjinKwon commented on pull request #30123: URL: https://github.com/apache/spark/pull/30123#issuecomment-716257191 @emilianbold, Apache JIRA ID because the ticket (SPARK-33234) has to be assigned properly to you.

[GitHub] [spark] SparkQA commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
SparkQA commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716257746 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34848/

[GitHub] [spark] AmplabJenkins commented on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
AmplabJenkins commented on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716257756

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #30062: URL: https://github.com/apache/spark/pull/30062#issuecomment-716254033

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30076: [SPARK-32862][SS] Left semi stream-stream join

2020-10-25 Thread GitBox
AmplabJenkins removed a comment on pull request #30076: URL: https://github.com/apache/spark/pull/30076#issuecomment-716257756
