[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29626: URL: https://github.com/apache/spark/pull/29626#issuecomment-690004634 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29626: URL: https://github.com/apache/spark/pull/29626#issuecomment-690004634

[GitHub] [spark] beliefer commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox
beliefer commented on pull request #29626: URL: https://github.com/apache/spark/pull/29626#issuecomment-690004119 retest this please

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29626: URL: https://github.com/apache/spark/pull/29626#issuecomment-690003006 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29626: URL: https://github.com/apache/spark/pull/29626#issuecomment-690002992

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29626: URL: https://github.com/apache/spark/pull/29626#issuecomment-690002992 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox
SparkQA removed a comment on pull request #29626: URL: https://github.com/apache/spark/pull/29626#issuecomment-689941100 **[Test build #128479 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128479/testReport)** for PR 29626 at commit

[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox
SparkQA commented on pull request #29626: URL: https://github.com/apache/spark/pull/29626#issuecomment-690002547 **[Test build #128479 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128479/testReport)** for PR 29626 at commit

[GitHub] [spark] mridulm commented on a change in pull request #28618: [SPARK-31801][API][SHUFFLE] Register map output metadata

2020-09-09 Thread GitBox
mridulm commented on a change in pull request #28618: URL: https://github.com/apache/spark/pull/28618#discussion_r486080456 ## File path: core/src/main/java/org/apache/spark/shuffle/sort/io/LocalDiskShuffleExecutorComponents.java ## @@ -17,69 +17,64 @@ package

[GitHub] [spark] KevinSmile edited a comment on pull request #29653: [SPARK-32804][Launcher] Fix run-example command builder bug

2020-09-09 Thread GitBox
KevinSmile edited a comment on pull request #29653: URL: https://github.com/apache/spark/pull/29653#issuecomment-689991002 It seems that `BlockManagerDecommissionIntegrationSuite` is unstable, as it passed in a previous ci but failed this time.

[GitHub] [spark] viirya commented on a change in pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
viirya commented on a change in pull request #29703: URL: https://github.com/apache/spark/pull/29703#discussion_r486078357 ## File path: python/setup.py ## @@ -16,14 +16,19 @@ # See the License for the specific language governing permissions and # limitations under the

[GitHub] [spark] Karl-WangSK removed a comment on pull request #29360: [SPARK-32542][SQL] Add an optimizer rule to split an Expand into multiple Expands for aggregates

2020-09-09 Thread GitBox
Karl-WangSK removed a comment on pull request #29360: URL: https://github.com/apache/spark/pull/29360#issuecomment-686560052 ready to merge if no other problems @LuciferYang Thanks!

[GitHub] [spark] dongjoon-hyun commented on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
dongjoon-hyun commented on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689995915 Ya. It looks okay for now. `branch-2.4` Maven also passed like the following. The `branch-3.0` and `master` Jenkins jobs didn't trigger on this commit yet and those will

[GitHub] [spark] zero323 commented on a change in pull request #29591: [SPARK-32714][PYTHON] Initial pyspark-stubs port.

2020-09-09 Thread GitBox
zero323 commented on a change in pull request #29591: URL: https://github.com/apache/spark/pull/29591#discussion_r486075857 ## File path: python/pyspark/ml/tests/test_algorithms.py ## @@ -333,7 +333,7 @@ def test_linear_regression_with_huber_loss(self): from

[GitHub] [spark] zero323 commented on a change in pull request #29591: [SPARK-32714][PYTHON] Initial pyspark-stubs port.

2020-09-09 Thread GitBox
zero323 commented on a change in pull request #29591: URL: https://github.com/apache/spark/pull/29591#discussion_r486075570 ## File path: python/pyspark/ml/common.pyi ## @@ -0,0 +1,27 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor

[GitHub] [spark] LuciferYang commented on pull request #29638: [SPARK-32687][SQL] Let CostBasedJoinReorder produce relatively deterministic optimization result

2020-09-09 Thread GitBox
LuciferYang commented on pull request #29638: URL: https://github.com/apache/spark/pull/29638#issuecomment-689993064 I will give a simplified fix to resolve `same input has different output with different Scala version ` issue in another pr, close this first.

[GitHub] [spark] LuciferYang closed pull request #29638: [SPARK-32687][SQL] Let CostBasedJoinReorder produce relatively deterministic optimization result

2020-09-09 Thread GitBox
LuciferYang closed pull request #29638: URL: https://github.com/apache/spark/pull/29638

[GitHub] [spark] maropu commented on a change in pull request #29672: [SPARK-32818][SQL] Make `CONVERT_METASTORE_PARQUET` and `CONVERT_METASTORE_ORC` session level configurable

2020-09-09 Thread GitBox
maropu commented on a change in pull request #29672: URL: https://github.com/apache/spark/pull/29672#discussion_r486071663 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ## @@ -2560,6 +2560,33 @@ abstract class SQLQuerySuiteBase

[GitHub] [spark] HyukjinKwon commented on pull request #29631: [SPARK-32772][SQL][FOLLOWUP] Remove legacy silent support mode for spark-sql CLI

2020-09-09 Thread GitBox
HyukjinKwon commented on pull request #29631: URL: https://github.com/apache/spark/pull/29631#issuecomment-689990904 Let me just revert this @wangyum. I know https://github.com/apache/spark/pull/29631#discussion_r485356997 is a kind of unlikely scenario but I think the change is still

[GitHub] [spark] KevinSmile commented on pull request #29653: [SPARK-32804][Launcher] Fix run-example command builder bug

2020-09-09 Thread GitBox
KevinSmile commented on pull request #29653: URL: https://github.com/apache/spark/pull/29653#issuecomment-689991002 It seems that `BlockManagerDecommissionIntegrationSuite` is unstable, as it passed in a previous ci but failed this time.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29705: [SPARK-32819][SQL][3.0] ignoreNullability parameter should be effective recursively

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29705: URL: https://github.com/apache/spark/pull/29705#issuecomment-689984706

[GitHub] [spark] HyukjinKwon closed pull request #29686: [SPARK-32312][SQL][PYTHON][test-java11] Upgrade Apache Arrow to version 1.0.1

2020-09-09 Thread GitBox
HyukjinKwon closed pull request #29686: URL: https://github.com/apache/spark/pull/29686

[GitHub] [spark] HyukjinKwon commented on pull request #29686: [SPARK-32312][SQL][PYTHON][test-java11] Upgrade Apache Arrow to version 1.0.1

2020-09-09 Thread GitBox
HyukjinKwon commented on pull request #29686: URL: https://github.com/apache/spark/pull/29686#issuecomment-689988364 Merged to master.

[GitHub] [spark] SparkQA commented on pull request #29705: [SPARK-32819][SQL][3.0] ignoreNullability parameter should be effective recursively

2020-09-09 Thread GitBox
SparkQA commented on pull request #29705: URL: https://github.com/apache/spark/pull/29705#issuecomment-689987245 **[Test build #128488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128488/testReport)** for PR 29705 at commit

[GitHub] [spark] AngersZhuuuu commented on pull request #29692: [SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox
AngersZhuuuu commented on pull request #29692: URL: https://github.com/apache/spark/pull/29692#issuecomment-689986896 ping @cloud-fan @hvanhovell @maryannxue

[GitHub] [spark] viirya commented on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
viirya commented on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689985664 Ok, I think it is fine as GitHub Actions passed.

[GitHub] [spark] AmplabJenkins commented on pull request #29705: [SPARK-32819][SQL][3.0] ignoreNullability parameter should be effective recursively

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29705: URL: https://github.com/apache/spark/pull/29705#issuecomment-689984706

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
dongjoon-hyun edited a comment on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689984279 I'll monitor Jenkins status. BTW, the current GitHub Action result is the following. ![Screen Shot 2020-09-09 at 10 05 58

[GitHub] [spark] dongjoon-hyun commented on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
dongjoon-hyun commented on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689984279 I'll monitor Jenkins status.

[GitHub] [spark] viirya opened a new pull request #29705: [SPARK-32819][SQL][3.0] ignoreNullability parameter should be effective recursively

2020-09-09 Thread GitBox
viirya opened a new pull request #29705: URL: https://github.com/apache/spark/pull/29705 ### What changes were proposed in this pull request? This patch proposes to check `ignoreNullability` parameter recursively in `equalsStructurally` method. This backports #29698 to

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
HyukjinKwon commented on a change in pull request #29703: URL: https://github.com/apache/spark/pull/29703#discussion_r486064585 ## File path: dev/create-release/release-build.sh ## @@ -275,6 +275,8 @@ if [[ "$1" == "package" ]]; then # In dry run mode, only build the first

[GitHub] [spark] mingjialiu commented on a change in pull request #29564: [WIP][SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2

2020-09-09 Thread GitBox
mingjialiu commented on a change in pull request #29564: URL: https://github.com/apache/spark/pull/29564#discussion_r486063202 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala ## @@ -52,6 +53,17 @@ case class

[GitHub] [spark] viirya commented on a change in pull request #29565: [SPARK-24994][SQL] Add UnwrapCastInBinaryComparison optimizer to simplify literal types

2020-09-09 Thread GitBox
viirya commented on a change in pull request #29565: URL: https://github.com/apache/spark/pull/29565#discussion_r486063074 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala ## @@ -0,0 +1,204 @@ +/* + *

[GitHub] [spark] cloud-fan commented on a change in pull request #29565: [SPARK-24994][SQL] Add UnwrapCastInBinaryComparison optimizer to simplify literal types

2020-09-09 Thread GitBox
cloud-fan commented on a change in pull request #29565: URL: https://github.com/apache/spark/pull/29565#discussion_r486063585 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala ## @@ -0,0 +1,204 @@ +/* + *

[GitHub] [spark] dongjoon-hyun commented on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
dongjoon-hyun commented on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689981755 I verified again locally. It works. Given the error message from Jenkins, the map seems to be overwritten to the read options back again by the streaming query. If

[GitHub] [spark] cloud-fan commented on a change in pull request #29565: [SPARK-24994][SQL] Add UnwrapCastInBinaryComparison optimizer to simplify literal types

2020-09-09 Thread GitBox
cloud-fan commented on a change in pull request #29565: URL: https://github.com/apache/spark/pull/29565#discussion_r486063309 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala ## @@ -0,0 +1,204 @@ +/* + *

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
HyukjinKwon commented on a change in pull request #29703: URL: https://github.com/apache/spark/pull/29703#discussion_r486062434 ## File path: python/pyspark/install.py ## @@ -0,0 +1,170 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor

[GitHub] [spark] cloud-fan closed pull request #29687: [SPARK-32826][SQL] Set the right column size for the null type in SparkGetColumnsOperation

2020-09-09 Thread GitBox
cloud-fan closed pull request #29687: URL: https://github.com/apache/spark/pull/29687

[GitHub] [spark] cloud-fan commented on pull request #29687: [SPARK-32826][SQL] Set the right column size for the null type in SparkGetColumnsOperation

2020-09-09 Thread GitBox
cloud-fan commented on pull request #29687: URL: https://github.com/apache/spark/pull/29687#issuecomment-689980129 thanks, merging to master!

[GitHub] [spark] cloud-fan commented on a change in pull request #29564: [WIP][SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2

2020-09-09 Thread GitBox
cloud-fan commented on a change in pull request #29564: URL: https://github.com/apache/spark/pull/29564#discussion_r486061776 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala ## @@ -52,6 +53,17 @@ case class

[GitHub] [spark] dongjoon-hyun commented on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
dongjoon-hyun commented on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689979351 BTW, `GitHub Action` also passed, so I merged this.

[GitHub] [spark] dongjoon-hyun commented on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
dongjoon-hyun commented on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689978839 Ur, let me check again. I checked in IntelliJ and run `build/sbt` streaming tests locally.

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
dongjoon-hyun edited a comment on pull request #29703: URL: https://github.com/apache/spark/pull/29703#issuecomment-689978218 Thank you for pinging me, @HyukjinKwon . cc @gatorsmile . Do you have any opinion on Hive 1.2 at Apache Spark 3.1.0?

[GitHub] [spark] dongjoon-hyun commented on pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
dongjoon-hyun commented on pull request #29703: URL: https://github.com/apache/spark/pull/29703#issuecomment-689978218 Thank you for pinging me, @HyukjinKwon . cc @gatorsmile . Do you have any opinion on Hive 1.2?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29704: [SPARK-32058][BUILD][FOLLOW-UP] Use Apache Hadoop 3.2.0 for PyPI and CRAN

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29704: URL: https://github.com/apache/spark/pull/29704#issuecomment-689977493

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29692: URL: https://github.com/apache/spark/pull/29692#issuecomment-689977523

[GitHub] [spark] viirya commented on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
viirya commented on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689977450 @dongjoon-hyun The test seems not passed?

[GitHub] [spark] AmplabJenkins commented on pull request #29704: [SPARK-32058][BUILD][FOLLOW-UP] Use Apache Hadoop 3.2.0 for PyPI and CRAN

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29704: URL: https://github.com/apache/spark/pull/29704#issuecomment-689977493

[GitHub] [spark] AmplabJenkins commented on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29692: URL: https://github.com/apache/spark/pull/29692#issuecomment-689977523

[GitHub] [spark] SparkQA commented on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox
SparkQA commented on pull request #29692: URL: https://github.com/apache/spark/pull/29692#issuecomment-689977122 **[Test build #128487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128487/testReport)** for PR 29692 at commit

[GitHub] [spark] tdas closed pull request #29700: [SPARK-32794][SS] Fixed rare corner case error in micro-batch engine with some stateful queries + no-data-batches + V1 sources

2020-09-09 Thread GitBox
tdas closed pull request #29700: URL: https://github.com/apache/spark/pull/29700

[GitHub] [spark] SparkQA commented on pull request #29704: [SPARK-32058][BUILD][FOLLOW-UP] Use Apache Hadoop 3.2.0 for PyPI and CRAN

2020-09-09 Thread GitBox
SparkQA commented on pull request #29704: URL: https://github.com/apache/spark/pull/29704#issuecomment-689977082 **[Test build #128486 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128486/testReport)** for PR 29704 at commit

[GitHub] [spark] HyukjinKwon opened a new pull request #29704: [SPARK-32058][BUILD][FOLLOW-UP] Use Apache Hadoop 3.2.0 for PyPI and CRAN

2020-09-09 Thread GitBox
HyukjinKwon opened a new pull request #29704: URL: https://github.com/apache/spark/pull/29704 ### What changes were proposed in this pull request? This is a followup of https://github.com/apache/spark/pull/28897. PyPI and CRAN did not change because of the concern about selecting

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689975508 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] HyukjinKwon commented on pull request #29704: [SPARK-32058][BUILD][FOLLOW-UP] Use Apache Hadoop 3.2.0 for PyPI and CRAN

2020-09-09 Thread GitBox
HyukjinKwon commented on pull request #29704: URL: https://github.com/apache/spark/pull/29704#issuecomment-689975962 cc @dongjoon-hyun, can you take a look when you're available?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689975504 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689975504

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29703: URL: https://github.com/apache/spark/pull/29703#issuecomment-689975255

[GitHub] [spark] AmplabJenkins commented on pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29703: URL: https://github.com/apache/spark/pull/29703#issuecomment-689975255

[GitHub] [spark] SparkQA commented on pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
SparkQA commented on pull request #29703: URL: https://github.com/apache/spark/pull/29703#issuecomment-689975060 **[Test build #128485 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128485/testReport)** for PR 29703 at commit

[GitHub] [spark] SparkQA commented on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
SparkQA commented on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689975081 **[Test build #128473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128473/testReport)** for PR 29701 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29701: [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer options correctly

2020-09-09 Thread GitBox
SparkQA removed a comment on pull request #29701: URL: https://github.com/apache/spark/pull/29701#issuecomment-689902002 **[Test build #128473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128473/testReport)** for PR 29701 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
HyukjinKwon commented on pull request #29703: URL: https://github.com/apache/spark/pull/29703#issuecomment-689973644 cc @srowen, @dongjoon-hyun, @holdenk, @BryanCutler, @viirya, @ueshin FYI

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
HyukjinKwon commented on a change in pull request #29703: URL: https://github.com/apache/spark/pull/29703#discussion_r486056702 ## File path: python/pyspark/install.py ## @@ -0,0 +1,170 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor

[GitHub] [spark] AmplabJenkins commented on pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29703: URL: https://github.com/apache/spark/pull/29703#issuecomment-689973252

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29703: URL: https://github.com/apache/spark/pull/29703#issuecomment-689973252

[GitHub] [spark] SparkQA commented on pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
SparkQA commented on pull request #29703: URL: https://github.com/apache/spark/pull/29703#issuecomment-689972908 **[Test build #128484 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128484/testReport)** for PR 29703 at commit

[GitHub] [spark] HyukjinKwon opened a new pull request #29703: [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

2020-09-09 Thread GitBox
HyukjinKwon opened a new pull request #29703: URL: https://github.com/apache/spark/pull/29703 ### What changes were proposed in this pull request? This PR proposes to add a way to select Hadoop and Hive versions in pip installation. Users can select Hive or Hadoop versions as

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29702: [SPARK-32832][SS] Use CaseInsensitiveMap for DataStreamReader/Writer options

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29702: URL: https://github.com/apache/spark/pull/29702#issuecomment-689969062

[GitHub] [spark] AmplabJenkins commented on pull request #29702: [SPARK-32832][SS] Use CaseInsensitiveMap for DataStreamReader/Writer options

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29702: URL: https://github.com/apache/spark/pull/29702#issuecomment-689969062

[GitHub] [spark] SparkQA commented on pull request #29702: [SPARK-32832][SS] Use CaseInsensitiveMap for DataStreamReader/Writer options

2020-09-09 Thread GitBox
SparkQA commented on pull request #29702: URL: https://github.com/apache/spark/pull/29702#issuecomment-689968671 **[Test build #128483 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128483/testReport)** for PR 29702 at commit

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29702: [SPARK-32832][SS] Use CaseInsensitiveMap for DataStreamReader/Writer options

2020-09-09 Thread GitBox
dongjoon-hyun commented on a change in pull request #29702: URL: https://github.com/apache/spark/pull/29702#discussion_r486051135 ## File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala ## @@ -535,5 +536,5 @@ final class DataStreamReader

[GitHub] [spark] gengliangwang commented on pull request #29564: [WIP][SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2

2020-09-09 Thread GitBox
gengliangwang commented on pull request #29564: URL: https://github.com/apache/spark/pull/29564#issuecomment-689964565 @mingjialiu the new fix looks more reasonable. Could you add test case for the changes?

[GitHub] [spark] mingjialiu commented on pull request #29564: [WIP][SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2

2020-09-09 Thread GitBox
mingjialiu commented on pull request #29564: URL: https://github.com/apache/spark/pull/29564#issuecomment-689963685 > In Branch 3.0, there is a mixed-in trait `SupportsPushDownFilters` which is introduced by #19136 and #19424 . > > However, if we are going to cherry-pick the PRs

[GitHub] [spark] mingjialiu commented on pull request #29564: [WIP][SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2

2020-09-09 Thread GitBox
mingjialiu commented on pull request #29564: URL: https://github.com/apache/spark/pull/29564#issuecomment-689963482 > If exchange reuse is broken, it means plan equality is broken somewhere. I think `Seq[Expression]` is OK as long as we canonicalize it before comparing it.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29461: [SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29461: URL: https://github.com/apache/spark/pull/29461#issuecomment-689962750 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29461: [SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29461: URL: https://github.com/apache/spark/pull/29461#issuecomment-689962750 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29461: [SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

2020-09-09 Thread GitBox
SparkQA commented on pull request #29461: URL: https://github.com/apache/spark/pull/29461#issuecomment-689962431 **[Test build #128482 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128482/testReport)** for PR 29461 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28618: [SPARK-31801][API][SHUFFLE] Register map output metadata

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #28618: URL: https://github.com/apache/spark/pull/28618#issuecomment-689960847 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28618: [SPARK-31801][API][SHUFFLE] Register map output metadata

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28618: URL: https://github.com/apache/spark/pull/28618#issuecomment-689960847 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29564: [WIP][SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29564: URL: https://github.com/apache/spark/pull/29564#issuecomment-689960789 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29564: [WIP][SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29564: URL: https://github.com/apache/spark/pull/29564#issuecomment-689960789 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29564: [WIP][SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2

2020-09-09 Thread GitBox
SparkQA commented on pull request #29564: URL: https://github.com/apache/spark/pull/29564#issuecomment-689960384 **[Test build #128481 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128481/testReport)** for PR 29564 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #28618: [SPARK-31801][API][SHUFFLE] Register map output metadata

2020-09-09 Thread GitBox
SparkQA removed a comment on pull request #28618: URL: https://github.com/apache/spark/pull/28618#issuecomment-689906398 **[Test build #128474 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128474/testReport)** for PR 28618 at commit

[GitHub] [spark] SparkQA commented on pull request #28618: [SPARK-31801][API][SHUFFLE] Register map output metadata

2020-09-09 Thread GitBox
SparkQA commented on pull request #28618: URL: https://github.com/apache/spark/pull/28618#issuecomment-689960126 **[Test build #128474 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128474/testReport)** for PR 28618 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #29565: [SPARK-24994][SQL] Add UnwrapCastInBinaryComparison optimizer to simplify literal types

2020-09-09 Thread GitBox
cloud-fan commented on a change in pull request #29565: URL: https://github.com/apache/spark/pull/29565#discussion_r486044404 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala ## @@ -0,0 +1,204 @@ +/* + *

[GitHub] [spark] cloud-fan commented on a change in pull request #29565: [SPARK-24994][SQL] Add UnwrapCastInBinaryComparison optimizer to simplify literal types

2020-09-09 Thread GitBox
cloud-fan commented on a change in pull request #29565: URL: https://github.com/apache/spark/pull/29565#discussion_r486044204 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala ## @@ -0,0 +1,204 @@ +/* + *

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29700: [SPARK-32794][SS] Fixed rare corner case error in micro-batch engine with some stateful queries + no-data-batches + V1 sources

2020-09-09 Thread GitBox
AmplabJenkins removed a comment on pull request #29700: URL: https://github.com/apache/spark/pull/29700#issuecomment-689957303 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29700: [SPARK-32794][SS] Fixed rare corner case error in micro-batch engine with some stateful queries + no-data-batches + V1 sources

2020-09-09 Thread GitBox
AmplabJenkins commented on pull request #29700: URL: https://github.com/apache/spark/pull/29700#issuecomment-689957303 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] viirya commented on pull request #29698: [SPARK-32819][SQL] ignoreNullability parameter should be effective recursively

2020-09-09 Thread GitBox
viirya commented on pull request #29698: URL: https://github.com/apache/spark/pull/29698#issuecomment-689956894 Thanks! I will open a PR for 3.0.

[GitHub] [spark] SparkQA removed a comment on pull request #29700: [SPARK-32794][SS] Fixed rare corner case error in micro-batch engine with some stateful queries + no-data-batches + V1 sources

2020-09-09 Thread GitBox
SparkQA removed a comment on pull request #29700: URL: https://github.com/apache/spark/pull/29700#issuecomment-689888682 **[Test build #128472 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128472/testReport)** for PR 29700 at commit

[GitHub] [spark] SparkQA commented on pull request #29700: [SPARK-32794][SS] Fixed rare corner case error in micro-batch engine with some stateful queries + no-data-batches + V1 sources

2020-09-09 Thread GitBox
SparkQA commented on pull request #29700: URL: https://github.com/apache/spark/pull/29700#issuecomment-689956289 **[Test build #128472 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128472/testReport)** for PR 29700 at commit

[GitHub] [spark] yaooqinn commented on pull request #29687: [SPARK-32826][SQL] Set the right column size for the null type in SparkGetColumnsOperation

2020-09-09 Thread GitBox
yaooqinn commented on pull request #29687: URL: https://github.com/apache/spark/pull/29687#issuecomment-689955329 updated, thanks for suggestions

[GitHub] [spark] cloud-fan commented on pull request #29687: [SPARK-32826][SQL] Add test case for get null columns using SparkGetColumnsOperation

2020-09-09 Thread GitBox
cloud-fan commented on pull request #29687: URL: https://github.com/apache/spark/pull/29687#issuecomment-689952870 please also update the PR description to mention the non-test change.

[GitHub] [spark] cloud-fan commented on a change in pull request #29687: [SPARK-32826][SQL][TEST] Add test case for get null columns using SparkGetColumnsOperation

2020-09-09 Thread GitBox
cloud-fan commented on a change in pull request #29687: URL: https://github.com/apache/spark/pull/29687#discussion_r486038353 ## File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkGetColumnsOperation.scala ## @@ -130,7 +130,8 @@

[GitHub] [spark] yaooqinn commented on a change in pull request #29687: [SPARK-32826][SQL][TEST] Add test case for get null columns using SparkGetColumnsOperation

2020-09-09 Thread GitBox
yaooqinn commented on a change in pull request #29687: URL: https://github.com/apache/spark/pull/29687#discussion_r486038499 ## File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkGetColumnsOperation.scala ## @@ -130,7 +130,8 @@

[GitHub] [spark] cloud-fan commented on a change in pull request #29461: [SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

2020-09-09 Thread GitBox
cloud-fan commented on a change in pull request #29461: URL: https://github.com/apache/spark/pull/29461#discussion_r486036434 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -3131,8 +3131,12 @@ class Dataset[T] private[sql]( * Returns a new

[GitHub] [spark] cloud-fan commented on a change in pull request #29461: [SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

2020-09-09 Thread GitBox
cloud-fan commented on a change in pull request #29461: URL: https://github.com/apache/spark/pull/29461#discussion_r486036511 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -3131,8 +3131,12 @@ class Dataset[T] private[sql]( * Returns a new
