[GitHub] [spark] SparkQA commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
SparkQA commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868237781 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44832/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-25 Thread GitBox
SparkQA commented on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-868260410 **[Test build #140300 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140300/testReport)** for PR 32767 at commit

[GitHub] [spark] SparkQA commented on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-06-25 Thread GitBox
SparkQA commented on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-868263221 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44837/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33075: [SPARK-35887][BUILD] Find and set JAVA_HOME from javac location

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33075: URL: https://github.com/apache/spark/pull/33075#issuecomment-868262278 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140298/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868262286 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140304/

[GitHub] [spark] SparkQA removed a comment on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868219290 **[Test build #140304 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140304/testReport)** for PR 33054 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-868179234 **[Test build #140300 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140300/testReport)** for PR 32767 at commit

[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
SparkQA commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868265537 **[Test build #140308 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140308/testReport)** for PR 33054 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868262281 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44832/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-868262279

[GitHub] [spark] SparkQA removed a comment on pull request #33075: [SPARK-35887][BUILD] Find and set JAVA_HOME from javac location

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33075: URL: https://github.com/apache/spark/pull/33075#issuecomment-868179023 **[Test build #140298 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140298/testReport)** for PR 33075 at commit

[GitHub] [spark] zhouyejoe opened a new pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state in a bett

2021-06-25 Thread GitBox
zhouyejoe opened a new pull request #33078: URL: https://github.com/apache/spark/pull/33078 ### What changes were proposed in this pull request? This is one of the patches for SPIP SPARK-30602 which is needed for push-based shuffle. ### Summary of the change: When Executor

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-25 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-868284598 Thank you for sharing, @arghya18. HADOOP-17755 sounds like a read-side issue, and the Magic committer is a write-side feature. I don't think they are related. If you hit a

[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
SparkQA commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868251582 **[Test build #140304 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140304/testReport)** for PR 33054 at commit

[GitHub] [spark] SparkQA commented on pull request #33075: [SPARK-35887][BUILD] Find and set JAVA_HOME from javac location

2021-06-25 Thread GitBox
SparkQA commented on pull request #33075: URL: https://github.com/apache/spark/pull/33075#issuecomment-868260910 **[Test build #140298 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140298/testReport)** for PR 33075 at commit

[GitHub] [spark] SparkQA commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
SparkQA commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868260792 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44834/

[GitHub] [spark] SparkQA commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
SparkQA commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868263810 **[Test build #140307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140307/testReport)** for PR 33063 at commit

[GitHub] [spark] mridulm commented on a change in pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
mridulm commented on a change in pull request #33063: URL: https://github.com/apache/spark/pull/33063#discussion_r658519592 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -397,21 +400,21 @@ final class

[GitHub] [spark] ulysses-you commented on a change in pull request #33079: [SPARK-35888][SQL] Add dataSize field in CoalescedPartitionSpec

2021-06-25 Thread GitBox
ulysses-you commented on a change in pull request #33079: URL: https://github.com/apache/spark/pull/33079#discussion_r658527265 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/ShufflePartitionsUtilSuite.scala ## @@ -46,45 +49,42 @@ class

[GitHub] [spark] ulysses-you commented on a change in pull request #33079: [SPARK-35888][SQL] Add dataSize field in CoalescedPartitionSpec

2021-06-25 Thread GitBox
ulysses-you commented on a change in pull request #33079: URL: https://github.com/apache/spark/pull/33079#discussion_r658528470 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/ShufflePartitionsUtilSuite.scala ## @@ -36,7 +36,10 @@ class

[GitHub] [spark] SparkQA commented on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-25 Thread GitBox
SparkQA commented on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-868261445 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44836/

[GitHub] [spark] AmplabJenkins commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868262281 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44832/

[GitHub] [spark] AmplabJenkins commented on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-868262279

[GitHub] [spark] AmplabJenkins commented on pull request #33075: [SPARK-35887][BUILD] Find and set JAVA_HOME from javac location

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33075: URL: https://github.com/apache/spark/pull/33075#issuecomment-868262278 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140298/

[GitHub] [spark] AmplabJenkins commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868262286 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140304/

[GitHub] [spark] SparkQA removed a comment on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868198475 **[Test build #140301 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140301/testReport)** for PR 33076 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868267868 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140301/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868267572 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44833/

[GitHub] [spark] AmplabJenkins commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868267868 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140301/

[GitHub] [spark] mridulm commented on a change in pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
mridulm commented on a change in pull request #33063: URL: https://github.com/apache/spark/pull/33063#discussion_r658522658 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -433,28 +436,25 @@ final class

[GitHub] [spark] viirya commented on pull request #33067: [SPARK-35884][SQL] EXPLAIN FORMATTED for AQE

2021-06-25 Thread GitBox
viirya commented on pull request #33067: URL: https://github.com/apache/spark/pull/33067#issuecomment-868281666 Thanks! Merging to master.

[GitHub] [spark] EnricoMi commented on a change in pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-06-25 Thread GitBox
EnricoMi commented on a change in pull request #31905: URL: https://github.com/apache/spark/pull/31905#discussion_r658534345 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Observation.scala ## @@ -0,0 +1,170 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
SparkQA commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868251072 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44833/

[GitHub] [spark] AmplabJenkins commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868267572 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44833/

[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
SparkQA commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868267548 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44833/

[GitHub] [spark] SparkQA commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
SparkQA commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868267562 **[Test build #140301 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140301/testReport)** for PR 33076 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state in a

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33078: URL: https://github.com/apache/spark/pull/33078#issuecomment-868269773 Can one of the admins verify this patch?

[GitHub] [spark] zhouyejoe commented on pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state in a bett

2021-06-25 Thread GitBox
zhouyejoe commented on pull request #33078: URL: https://github.com/apache/spark/pull/33078#issuecomment-868270264 PR created. CC @Ngone51 @mridulm @Victsm @otterc

[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
SparkQA commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868274094 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44835/

[GitHub] [spark] viirya commented on a change in pull request #33067: [SPARK-35884][SQL] EXPLAIN FORMATTED for AQE

2021-06-25 Thread GitBox
viirya commented on a change in pull request #33067: URL: https://github.com/apache/spark/pull/33067#discussion_r658523194 ## File path: sql/core/src/test/scala/org/apache/spark/sql/ExplainSuite.scala ## @@ -547,30 +547,107 @@ class ExplainSuiteAE extends ExplainSuiteHelper

[GitHub] [spark] SparkQA commented on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-06-25 Thread GitBox
SparkQA commented on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-868279983 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44837/

[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
SparkQA commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868281740 **[Test build #140308 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140308/testReport)** for PR 33054 at commit

[GitHub] [spark] viirya closed pull request #33067: [SPARK-35884][SQL] EXPLAIN FORMATTED for AQE

2021-06-25 Thread GitBox
viirya closed pull request #33067: URL: https://github.com/apache/spark/pull/33067

[GitHub] [spark] yaooqinn commented on a change in pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
yaooqinn commented on a change in pull request #33063: URL: https://github.com/apache/spark/pull/33063#discussion_r658533727 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -433,28 +436,25 @@ final class

[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
SparkQA commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868259675 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44835/

[GitHub] [spark] itholic edited a comment on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
itholic edited a comment on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868201074 ~~Seems like it's a known bug in mypy (https://github.com/python/mypy/issues/1153).~~ ~~I'd ignore this case for now.~~

[GitHub] [spark] itholic commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
itholic commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868264170 > Ignoring is fine, but are you sure this is the same issue? In our case, we're importing it without try-except: > >

[GitHub] [spark] ulysses-you opened a new pull request #33079: [SPARK-35888][SQL] Add dataSize field in CoalescedPartitionSpec

2021-06-25 Thread GitBox
ulysses-you opened a new pull request #33079: URL: https://github.com/apache/spark/pull/33079 ### What changes were proposed in this pull request? * add `dataSize` field in `CoalescedPartitionSpec` * add data size test suite in `ShufflePartitionsUtilSuite` ### Why

[GitHub] [spark] SparkQA commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
SparkQA commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868277928 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44834/

[GitHub] [spark] SparkQA commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
SparkQA commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868288071 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44838/

[GitHub] [spark] AmplabJenkins commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868291723 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44834/

[GitHub] [spark] AmplabJenkins commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868291727

[GitHub] [spark] AmplabJenkins commented on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-868291726 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44837/

[GitHub] [spark] SparkQA commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
SparkQA commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868293726 **[Test build #140310 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140310/testReport)** for PR 33076 at commit

[GitHub] [spark] EnricoMi commented on a change in pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-06-25 Thread GitBox
EnricoMi commented on a change in pull request #31905: URL: https://github.com/apache/spark/pull/31905#discussion_r658553258 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Observation.scala ## @@ -0,0 +1,170 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
SparkQA commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868302420 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44840/

[GitHub] [spark] mridulm commented on a change in pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
mridulm commented on a change in pull request #33063: URL: https://github.com/apache/spark/pull/33063#discussion_r658558309 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -433,28 +436,25 @@ final class

[GitHub] [spark] HeartSaVioR opened a new pull request #33081: [SPARK-34893][SS] Support session window natively

2021-06-25 Thread GitBox
HeartSaVioR opened a new pull request #33081: URL: https://github.com/apache/spark/pull/33081 ### What changes were proposed in this pull request? This PR proposes to support native session window. Please refer to the comments/design doc in SPARK-10816 for more details on the

[GitHub] [spark] SparkQA commented on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-25 Thread GitBox
SparkQA commented on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-868324671 **[Test build #140305 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140305/testReport)** for PR 32767 at commit

[GitHub] [spark] SparkQA commented on pull request #32832: [SPARK-35686][SQL] Not allow using auto-generated alias when creating view

2021-06-25 Thread GitBox
SparkQA commented on pull request #32832: URL: https://github.com/apache/spark/pull/32832#issuecomment-868327194 **[Test build #140317 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140317/testReport)** for PR 32832 at commit

[GitHub] [spark] SparkQA commented on pull request #33079: [SPARK-35888][SQL] Add dataSize field in CoalescedPartitionSpec

2021-06-25 Thread GitBox
SparkQA commented on pull request #33079: URL: https://github.com/apache/spark/pull/33079#issuecomment-868330280 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44841/

[GitHub] [spark] SparkQA commented on pull request #33038: [SPARK-35861][SS] Introduce "prefix match scan" feature on state store

2021-06-25 Thread GitBox
SparkQA commented on pull request #33038: URL: https://github.com/apache/spark/pull/33038#issuecomment-868330474 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44844/

[GitHub] [spark] SparkQA commented on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-06-25 Thread GitBox
SparkQA commented on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-868359697 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44843/

[GitHub] [spark] HeartSaVioR commented on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-25 Thread GitBox
HeartSaVioR commented on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-868373147 Thanks @xuanyuanking for the contribution! I merged into master.

[GitHub] [spark] SparkQA removed a comment on pull request #32832: [SPARK-35686][SQL] Not allow using auto-generated alias when creating view

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #32832: URL: https://github.com/apache/spark/pull/32832#issuecomment-868327194 **[Test build #140317 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140317/testReport)** for PR 32832 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32832: [SPARK-35686][SQL] Not allow using auto-generated alias when creating view

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #32832: URL: https://github.com/apache/spark/pull/32832#issuecomment-868379883 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140317/

[GitHub] [spark] SparkQA commented on pull request #32832: [SPARK-35686][SQL] Not allow using auto-generated alias when creating view

2021-06-25 Thread GitBox
SparkQA commented on pull request #32832: URL: https://github.com/apache/spark/pull/32832#issuecomment-868379351 **[Test build #140317 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140317/testReport)** for PR 32832 at commit

[GitHub] [spark] SparkQA commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
SparkQA commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868383956 **[Test build #140320 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140320/testReport)** for PR 33063 at commit

[GitHub] [spark] ulysses-you commented on a change in pull request #32883: [SPARK-35725][SQL] Support repartition expand partitions in AQE

2021-06-25 Thread GitBox
ulysses-you commented on a change in pull request #32883: URL: https://github.com/apache/spark/pull/32883#discussion_r658646852 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -642,6 +642,15 @@ object SQLConf {

[GitHub] [spark] SparkQA commented on pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state in a better

2021-06-25 Thread GitBox
SparkQA commented on pull request #33078: URL: https://github.com/apache/spark/pull/33078#issuecomment-868391713 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44847/

[GitHub] [spark] gengliangwang closed pull request #33072: [SPARK-35817][SQL][3.1] Restore performance of queries against wide Avro tables

2021-06-25 Thread GitBox
gengliangwang closed pull request #33072: URL: https://github.com/apache/spark/pull/33072

[GitHub] [spark] SparkQA commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
SparkQA commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868404141 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44850/

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
HyukjinKwon commented on a change in pull request #33054: URL: https://github.com/apache/spark/pull/33054#discussion_r658671949 ## File path: python/pyspark/pandas/plot/core.py ## @@ -20,7 +20,7 @@ import pandas as pd import numpy as np from pyspark.ml.feature import

[GitHub] [spark] SparkQA removed a comment on pull request #33079: [SPARK-35888][SQL] Add dataSize field in CoalescedPartitionSpec

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33079: URL: https://github.com/apache/spark/pull/33079#issuecomment-868293722 **[Test build #140309 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140309/testReport)** for PR 33079 at commit

[GitHub] [spark] SparkQA commented on pull request #32832: [SPARK-35686][SQL] Not allow using auto-generated alias when creating view

2021-06-25 Thread GitBox
SparkQA commented on pull request #32832: URL: https://github.com/apache/spark/pull/32832#issuecomment-868420400 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44849/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle matched subQuery but not VariableValue

2021-06-25 Thread GitBox
SparkQA commented on pull request #33082: URL: https://github.com/apache/spark/pull/33082#issuecomment-868424440 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44851/ -- This is an automated message from the

[GitHub] [spark] HeartSaVioR commented on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-25 Thread GitBox
HeartSaVioR commented on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-868431278 UPDATE: we found a consistent break on Scala 2.13 build caused by this. @xuanyuanking is working on the fix so please allow us some time to fix it as follow-up PR instead

[GitHub] [spark] SparkQA commented on pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state in a better

2021-06-25 Thread GitBox
SparkQA commented on pull request #33078: URL: https://github.com/apache/spark/pull/33078#issuecomment-868434122 **[Test build #140315 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140315/testReport)** for PR 33078 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868437504 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44852/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868437498 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44850/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle matched subQuery but not VariableValue

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33082: URL: https://github.com/apache/spark/pull/33082#issuecomment-868437497 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44851/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32832: [SPARK-35686][SQL] Not allow using auto-generated alias when creating view

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #32832: URL: https://github.com/apache/spark/pull/32832#issuecomment-868437496 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44849/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-868437499 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140306/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state in a

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33078: URL: https://github.com/apache/spark/pull/33078#issuecomment-868437501 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140315/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33079: [SPARK-35888][SQL] Add dataSize field in CoalescedPartitionSpec

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33079: URL: https://github.com/apache/spark/pull/33079#issuecomment-868437500 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140309/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868437498 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44850/

[GitHub] [spark] arghya18 commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-25 Thread GitBox
arghya18 commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-868289618 @dongjoon-hyun Thanks for your response. Yes I understand magic committer is not related, I just wanted to understand if I build Spark 3.1.1 with Hadoop 3.3.0/3.3.1 will the

[GitHub] [spark] SparkQA commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-06-25 Thread GitBox
SparkQA commented on pull request #31989: URL: https://github.com/apache/spark/pull/31989#issuecomment-868296368 **[Test build #140313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140313/testReport)** for PR 31989 at commit

[GitHub] [spark] mridulm commented on pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state in a better

2021-06-25 Thread GitBox
mridulm commented on pull request #33078: URL: https://github.com/apache/spark/pull/33078#issuecomment-868305928 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] mridulm commented on a change in pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
mridulm commented on a change in pull request #33063: URL: https://github.com/apache/spark/pull/33063#discussion_r658558309 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -433,28 +436,25 @@ final class

[GitHub] [spark] AmplabJenkins commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33054: URL: https://github.com/apache/spark/pull/33054#issuecomment-868326320 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] AmplabJenkins commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868326323 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44838/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-868326321 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140305/ -- This

[GitHub] [spark] SparkQA commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
SparkQA commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868327737 **[Test build #140307 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140307/testReport)** for PR 33063 at commit

[GitHub] [spark] SparkQA commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
SparkQA commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868360216 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44842/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state in a better

2021-06-25 Thread GitBox
SparkQA commented on pull request #33078: URL: https://github.com/apache/spark/pull/33078#issuecomment-868369375 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44847/ -- This is an automated message from the Apache

[GitHub] [spark] yaooqinn commented on a change in pull request #32931: [SPARK-33898][SQL] Support SHOW CREATE TABLE In V2

2021-06-25 Thread GitBox
yaooqinn commented on a change in pull request #32931: URL: https://github.com/apache/spark/pull/32931#discussion_r658647119 ## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala ## @@ -1961,12 +1961,98 @@ class DataSourceV2SQLSuite

[GitHub] [spark] ulysses-you commented on a change in pull request #32883: [SPARK-35725][SQL] Support repartition expand partitions in AQE

2021-06-25 Thread GitBox
ulysses-you commented on a change in pull request #32883: URL: https://github.com/apache/spark/pull/32883#discussion_r658647370 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ## @@ -99,6 +99,7 @@ case class

[GitHub] [spark] ulysses-you commented on a change in pull request #32883: [SPARK-35725][SQL] Support repartition expand partitions in AQE

2021-06-25 Thread GitBox
ulysses-you commented on a change in pull request #32883: URL: https://github.com/apache/spark/pull/32883#discussion_r658647513 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ExpandShufflePartitions.scala ## @@ -0,0 +1,98 @@ +/* + * Licensed to
