[GitHub] [spark] SparkQA removed a comment on pull request #31316: [SPARK-33599][SQL][FOLLOWUP] Group exception messages in catalyst/analysis

2021-02-09 Thread GitBox


SparkQA removed a comment on pull request #31316:
URL: https://github.com/apache/spark/pull/31316#issuecomment-776512053


   **[Test build #135095 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135095/testReport)**
 for PR 31316 at commit 
[`be15d67`](https://github.com/apache/spark/commit/be15d6766e37536e28111349fc85eddbcac020cb).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on pull request #31316: [SPARK-33599][SQL][FOLLOWUP] Group exception messages in catalyst/analysis

2021-02-09 Thread GitBox


SparkQA commented on pull request #31316:
URL: https://github.com/apache/spark/pull/31316#issuecomment-776514201


   **[Test build #135095 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135095/testReport)**
 for PR 31316 at commit 
[`be15d67`](https://github.com/apache/spark/commit/be15d6766e37536e28111349fc85eddbcac020cb).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #31542: [SPARK-34414][SQL] OptimizeMetadataOnlyQuery should only apply for deterministic filters

2021-02-09 Thread GitBox


SparkQA commented on pull request #31542:
URL: https://github.com/apache/spark/pull/31542#issuecomment-776512107


   **[Test build #135093 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135093/testReport)**
 for PR 31542 at commit 
[`050398b`](https://github.com/apache/spark/commit/050398bb49552721ff4aea51af6e1463f3fe2075).






[GitHub] [spark] SparkQA commented on pull request #31316: [SPARK-33599][SQL][FOLLOWUP] Group exception messages in catalyst/analysis

2021-02-09 Thread GitBox


SparkQA commented on pull request #31316:
URL: https://github.com/apache/spark/pull/31316#issuecomment-776512053


   **[Test build #135095 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135095/testReport)**
 for PR 31316 at commit 
[`be15d67`](https://github.com/apache/spark/commit/be15d6766e37536e28111349fc85eddbcac020cb).






[GitHub] [spark] SparkQA commented on pull request #31541: Revert "[SPARK-34209][SQL] Delegate table name validation to the session catalog"

2021-02-09 Thread GitBox


SparkQA commented on pull request #31541:
URL: https://github.com/apache/spark/pull/31541#issuecomment-776511932


   **[Test build #135094 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135094/testReport)**
 for PR 31541 at commit 
[`fde78fc`](https://github.com/apache/spark/commit/fde78fcb058556c6393295f183f36a0e0af15ff3).






[GitHub] [spark] HyukjinKwon commented on a change in pull request #31494: [SPARK-34380][SQL] Support ifExists for ALTER TABLE ... UNSET TBLPROPERTIES for v2 command

2021-02-09 Thread GitBox


HyukjinKwon commented on a change in pull request #31494:
URL: https://github.com/apache/spark/pull/31494#discussion_r573502020



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala
##
@@ -1141,6 +1141,36 @@ trait AlterTableTests extends SharedSparkSession {
 }
   }
 
+  test("AlterTable: remove nonexistent table property") {

Review comment:
   Shall we add a JIRA prefix?








[GitHub] [spark] AmplabJenkins removed a comment on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


AmplabJenkins removed a comment on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776503691


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135091/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


AmplabJenkins removed a comment on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776503690


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135088/
   






[GitHub] [spark] gaborgsomogyi commented on pull request #31384: [SPARK-31816][SQL][DOCS] Added high level description about JDBC connection providers for users/developers

2021-02-09 Thread GitBox


gaborgsomogyi commented on pull request #31384:
URL: https://github.com/apache/spark/pull/31384#issuecomment-776504404


   @HyukjinKwon @maropu @HeartSaVioR Thank you for taking care!
   With this last PR I consider the feature done from my side.






[GitHub] [spark] AmplabJenkins commented on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


AmplabJenkins commented on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776503691


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135091/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


AmplabJenkins commented on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776503690


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135088/
   






[GitHub] [spark] HyukjinKwon commented on pull request #31542: [MINOR][SQL] OptimizeMetadataOnlyQuery should only apply for deterministic filters

2021-02-09 Thread GitBox


HyukjinKwon commented on pull request #31542:
URL: https://github.com/apache/spark/pull/31542#issuecomment-776503505


   @yeshengm do you mind filing a JIRA please?






[GitHub] [spark] yeshengm opened a new pull request #31542: [MINOR][SQL] OptimizeMetadataOnlyQuery should only apply for deterministic filters

2021-02-09 Thread GitBox


yeshengm opened a new pull request #31542:
URL: https://github.com/apache/spark/pull/31542


   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   Similar to FileSourcePartitionPruning, OptimizeMetadataOnlyQuery should only 
apply for deterministic filters. If filters are non-deterministic, they have to 
be evaluated against partitions separately.
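
   As a rough illustration of the reasoning above, here is a toy model in plain Scala. It does not use Spark's classes: `Filter`, `canUseMetadataOnly`, `metadataOnly`, `perRow` and `everySecondCall` are hypothetical names, and a call counter stands in for a genuinely non-deterministic expression such as a random predicate. The sketch only shows that evaluating a filter once per partition value and once per row can disagree when the filter is not deterministic.

```scala
object MetadataOnlySketch {
  // Toy stand-in for a filter on the partition column (not a Spark class).
  final case class Filter(deterministic: Boolean, eval: String => Boolean)

  // Hypothetical guard: take the metadata-only shortcut only when every
  // filter is deterministic.
  def canUseMetadataOnly(filters: Seq[Filter]): Boolean =
    filters.forall(_.deterministic)

  // Metadata-only evaluation: each filter sees every partition value once.
  def metadataOnly(partitions: Seq[String], filters: Seq[Filter]): Seq[String] =
    partitions.filter(p => filters.forall(f => f.eval(p)))

  // Regular evaluation: each filter sees every row, then values are deduplicated.
  // Rows are modelled by their partition column value only.
  def perRow(rows: Seq[String], filters: Seq[Filter]): Seq[String] =
    rows.filter(r => filters.forall(f => f.eval(r))).distinct

  // Stand-in for a non-deterministic filter: its result depends on how many
  // times it has been called, not on its input.
  def everySecondCall(): Filter = {
    var calls = 0
    Filter(deterministic = false, eval = (_: String) => { calls += 1; calls % 2 == 0 })
  }
}
```

   With rows `a, a, b, b`, a deterministic filter gives the same answer on both paths, while `everySecondCall()` keeps only `b` on the metadata-only path and both `a` and `b` on the per-row path, which is why the guard rejects it.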
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Existing UTs.
   






[GitHub] [spark] SparkQA removed a comment on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


SparkQA removed a comment on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776412166


   **[Test build #135088 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135088/testReport)**
 for PR 31337 at commit 
[`c6fb5a8`](https://github.com/apache/spark/commit/c6fb5a8a696af79c5ccacd71b8c3a711093b415d).






[GitHub] [spark] SparkQA commented on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


SparkQA commented on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776497711


   **[Test build #135088 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135088/testReport)**
 for PR 31337 at commit 
[`c6fb5a8`](https://github.com/apache/spark/commit/c6fb5a8a696af79c5ccacd71b8c3a711093b415d).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


SparkQA removed a comment on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776430570


   **[Test build #135091 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135091/testReport)**
 for PR 31442 at commit 
[`b838218`](https://github.com/apache/spark/commit/b83821806c87ba6d710efe604a7c9d270af9617b).






[GitHub] [spark] SparkQA commented on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


SparkQA commented on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776491691


   **[Test build #135091 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135091/testReport)**
 for PR 31442 at commit 
[`b838218`](https://github.com/apache/spark/commit/b83821806c87ba6d710efe604a7c9d270af9617b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] HyukjinKwon commented on pull request #31541: Revert "[SPARK-34209][SQL] Delegate table name validation to the session catalog"

2021-02-09 Thread GitBox


HyukjinKwon commented on pull request #31541:
URL: https://github.com/apache/spark/pull/31541#issuecomment-776486423


   @holdenk, @cloud-fan, @imback82, @dongjoon-hyun, @maropu, I opened a PR to 
discuss the revert of https://github.com/apache/spark/pull/31427. Could you 
guys take a look please?






[GitHub] [spark] HyukjinKwon opened a new pull request #31541: Revert "[SPARK-34209][SQL] Delegate table name validation to the session catalog"

2021-02-09 Thread GitBox


HyukjinKwon opened a new pull request #31541:
URL: https://github.com/apache/spark/pull/31541


   ### What changes were proposed in this pull request?
   
   This reverts commit cf7a13c363ef5d56556c9d70e7811bf6a40de55f.
   
   ### Why are the changes needed?
   
   To fix the regression in the error message; it is also not obvious what the 
gain was.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, this is a revert.
   
   ### How was this patch tested?
   
   Reverted test cases should verify this revert.






[GitHub] [spark] MaxGekk commented on pull request #31499: [SPARK-31891][SQL] Support `MSCK REPAIR TABLE .. [ADD|DROP|SYNC] PARTITIONS`

2021-02-09 Thread GitBox


MaxGekk commented on pull request #31499:
URL: https://github.com/apache/spark/pull/31499#issuecomment-776473007


   @dongjoon-hyun @viirya Are you ok with the changes?






[GitHub] [spark] zzcclp commented on a change in pull request #31404: [SPARK-34283][SQL] Combines all adjacent 'Union' operators into a single 'Union' when using 'Dataset.union.distinct.union.distinct'

2021-02-09 Thread GitBox


zzcclp commented on a change in pull request #31404:
URL: https://github.com/apache/spark/pull/31404#discussion_r573482876



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -1007,6 +1007,8 @@ object CombineUnions extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
     case u: Union => flattenUnion(u, false)
     case Distinct(u: Union) => Distinct(flattenUnion(u, true))
+    case Deduplicate(keys: Seq[Attribute], u: Union) if keys == u.output =>

Review comment:
   I have added a test case for the different-order case, and the condition 
works fine. Please review, @maropu @cloud-fan, thanks.
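
   For readers following the hunk above, a self-contained toy model of the idea may help. It is only a sketch under stated assumptions: the plan classes below are simplified stand-ins rather than Catalyst's, the rewrite is applied at the root only (no `transformDown`), and comparing keys as sets is one possible way to ignore column order, not necessarily what the PR does. The point it illustrates is that a `Deduplicate` over all output columns behaves like `Distinct`, so unions beneath it can be flattened.

```scala
object CombineUnionsSketch {
  // Simplified stand-ins for logical plan nodes; columns are plain names.
  sealed trait Plan { def output: Seq[String] }
  final case class Relation(name: String, output: Seq[String]) extends Plan
  final case class Union(children: Seq[Plan]) extends Plan {
    def output: Seq[String] = children.head.output
  }
  final case class Distinct(child: Plan) extends Plan {
    def output: Seq[String] = child.output
  }
  final case class Deduplicate(keys: Seq[String], child: Plan) extends Plan {
    def output: Seq[String] = child.output
  }

  // Collect the children of adjacent unions. When flattenDistinct is true,
  // a Distinct (or an all-column Deduplicate) sitting between two unions is
  // dropped as well, because the caller re-applies deduplication on top.
  def flattenUnion(u: Union, flattenDistinct: Boolean): Union =
    Union(u.children.flatMap {
      case c: Union => flattenUnion(c, flattenDistinct).children
      case Distinct(c: Union) if flattenDistinct =>
        flattenUnion(c, flattenDistinct).children
      case Deduplicate(keys, c: Union)
          if flattenDistinct && keys.toSet == c.output.toSet =>
        flattenUnion(c, flattenDistinct).children
      case other => Seq(other)
    })

  // Root-level rewrite mirroring the three cases in the hunk.
  def combineUnions(plan: Plan): Plan = plan match {
    case u: Union => flattenUnion(u, flattenDistinct = false)
    case Distinct(u: Union) => Distinct(flattenUnion(u, flattenDistinct = true))
    case Deduplicate(keys, u: Union) if keys.toSet == u.output.toSet =>
      Deduplicate(keys, flattenUnion(u, flattenDistinct = true))
    case other => other
  }
}
```

   In this model a `Deduplicate` whose keys cover only some of the output columns is left untouched, since removing an inner deduplication would then change the result.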








[GitHub] [spark] zzcclp commented on pull request #31404: [SPARK-34283][SQL] Combines all adjacent 'Union' operators into a single 'Union' when using 'Dataset.union.distinct.union.distinct'

2021-02-09 Thread GitBox


zzcclp commented on pull request #31404:
URL: https://github.com/apache/spark/pull/31404#issuecomment-776482363


   > Could you update the PR description along with the current approach?
   
   Done






[GitHub] [spark] beliefer commented on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


beliefer commented on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776481714


   @cloud-fan Thanks for your work! @maropu @viirya Thanks for your review!






[GitHub] [spark] cloud-fan closed pull request #31529: [SPARK-34404][SQL] Add new Avro datasource options to control datetime rebasing in read

2021-02-09 Thread GitBox


cloud-fan closed pull request #31529:
URL: https://github.com/apache/spark/pull/31529


   






[GitHub] [spark] cloud-fan commented on pull request #31529: [SPARK-34404][SQL] Add new Avro datasource options to control datetime rebasing in read

2021-02-09 Thread GitBox


cloud-fan commented on pull request #31529:
URL: https://github.com/apache/spark/pull/31529#issuecomment-776476030


   thanks, merging to master!






[GitHub] [spark] cloud-fan closed pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


cloud-fan closed pull request #31337:
URL: https://github.com/apache/spark/pull/31337


   






[GitHub] [spark] cloud-fan commented on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


cloud-fan commented on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776477090


   thanks, merging to master!






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


AmplabJenkins removed a comment on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776476510


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39674/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


AmplabJenkins removed a comment on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776476511


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39673/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


AmplabJenkins removed a comment on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776476508


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39671/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


AmplabJenkins commented on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776476508


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39671/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


AmplabJenkins commented on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776476510


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39674/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


AmplabJenkins commented on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776476511


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39673/
   






[GitHub] [spark] LuciferYang commented on pull request #31528: [SPARK-34403][SQL]Remove dependency to commons-httpclient

2021-02-09 Thread GitBox


LuciferYang commented on pull request #31528:
URL: https://github.com/apache/spark/pull/31528#issuecomment-776475281


   > hive-exec need this dependency.
   
   Will this class actually be loaded by the JVM? Do we need to wait for Hive to 
upgrade this dependency first?






[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31538: [SPARK-34412][SQL] RemoveNoopOperators should only remove trivial projects

2021-02-09 Thread GitBox


dongjoon-hyun commented on a change in pull request #31538:
URL: https://github.com/apache/spark/pull/31538#discussion_r573472762



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RemoveNoopOperatorsSuite.scala
##
@@ -44,6 +45,22 @@ class RemoveNoopOperatorsSuite extends PlanTest {
 comparePlans(optimized, testRelation)
   }
 
+  test("Do not remove projections with non-attribute expressions that reuse 
expr ids") {

Review comment:
   It seems that all newly added UTs fail with the following exception.
   ```
   org.apache.spark.sql.catalyst.errors.package$TreeNodeException:
   The structural integrity of the input plan is broken in 
org.apache.spark.sql.catalyst.analysis.SimpleAnalyzer., tree:
   ```
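
   To make the test name above concrete, the following toy model (plain Scala, no Spark dependency) sketches why a projection that reuses an expression id is not necessarily a no-op. All names here (`Attribute`, `Alias`, `Project`, `sameIds`, `isTrivial`) are simplified, hypothetical stand-ins, and the actual condition used in the PR is not shown in this thread; the sketch only assumes, following the PR title, that a removable project must consist of plain attributes.

```scala
object RemoveNoopSketch {
  sealed trait Expr { def exprId: Long }
  // A plain pass-through reference to a column.
  final case class Attribute(name: String, exprId: Long) extends Expr
  // An alias computes `child` (kept as text here) but exposes the result
  // under `exprId`, which may coincide with an existing attribute's id.
  final case class Alias(child: String, name: String, exprId: Long) extends Expr

  // A projection together with the output of the plan beneath it.
  final case class Project(projectList: Seq[Expr], childOutput: Seq[Attribute])

  // Id-only comparison: the project looks like a no-op when the ids line up.
  def sameIds(p: Project): Boolean =
    p.projectList.map(_.exprId) == p.childOutput.map(_.exprId)

  // Stricter check: every entry must also be a plain attribute.
  def isTrivial(p: Project): Boolean =
    p.projectList.forall(_.isInstanceOf[Attribute]) && sameIds(p)
}
```

   In this model a project that aliases `a + 1` under the id of `a` passes the id-only comparison but fails the stricter check, so removing it would silently drop the computation.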








[GitHub] [spark] SparkQA commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


SparkQA commented on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776468501


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39674/
   






[GitHub] [spark] MaxGekk commented on pull request #31529: [SPARK-34404][SQL] Add new Avro datasource options to control datetime rebasing in read

2021-02-09 Thread GitBox


MaxGekk commented on pull request #31529:
URL: https://github.com/apache/spark/pull/31529#issuecomment-776460420


   @cloud-fan @gengliangwang @HyukjinKwon This PR is a companion to 
https://github.com/apache/spark/pull/31489, and should solve the same issues. 
Any concerns about it?






[GitHub] [spark] SparkQA commented on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


SparkQA commented on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776460281


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39673/
   






[GitHub] [spark] MaxGekk commented on pull request #31499: [SPARK-31891][SQL] Support `MSCK REPAIR TABLE .. [ADD|DROP|SYNC] PARTITIONS`

2021-02-09 Thread GitBox


MaxGekk commented on pull request #31499:
URL: https://github.com/apache/spark/pull/31499#issuecomment-776459632


   @cloud-fan @HyukjinKwon Any objections to the changes?






[GitHub] [spark] maropu commented on a change in pull request #31404: [SPARK-34283][SQL] Combines all adjacent 'Union' operators into a single 'Union' when using 'Dataset.union.distinct.union.distinct

2021-02-09 Thread GitBox


maropu commented on a change in pull request #31404:
URL: https://github.com/apache/spark/pull/31404#discussion_r573462571



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -1007,6 +1007,8 @@ object CombineUnions extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
 case u: Union => flattenUnion(u, false)
 case Distinct(u: Union) => Distinct(flattenUnion(u, true))
+case Deduplicate(keys: Seq[Attribute], u: Union) if keys == u.output =>

Review comment:
   > If u.output has new expr IDs, then this plan will be unresolved...
   
   Oh, I see. I missed that.








[GitHub] [spark] HyukjinKwon commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

2021-02-09 Thread GitBox


HyukjinKwon commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776457314


   Hey, I'm still lost on why we needed to change this. This is a regression in 
the error message, and it's not clear to me what the gain is here, @dongjoon-hyun.






[GitHub] [spark] cloud-fan commented on a change in pull request #31404: [SPARK-34283][SQL] Combines all adjacent 'Union' operators into a single 'Union' when using 'Dataset.union.distinct.union.disti

2021-02-09 Thread GitBox


cloud-fan commented on a change in pull request #31404:
URL: https://github.com/apache/spark/pull/31404#discussion_r573461209



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -1007,6 +1007,8 @@ object CombineUnions extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
 case u: Union => flattenUnion(u, false)
 case Distinct(u: Union) => Distinct(flattenUnion(u, true))
+case Deduplicate(keys: Seq[Attribute], u: Union) if keys == u.output =>

Review comment:
   If `u.output` has new expr IDs, then this plan will be unresolved...
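
The concern above can be illustrated with a toy model. This is only a sketch: `Attr`, `Union`, `Deduplicate`, and `ExprIdDemo` below are hypothetical stand-ins written for this example, not Catalyst's actual classes, and the "rewrite" is simulated by bumping the IDs. The point it tries to show is that keys are bound by expression ID rather than by name, so a child that exposes the same column names under fresh IDs leaves the parent's keys dangling.

```scala
// Toy model only: hypothetical stand-ins, not Catalyst's API.
final case class Attr(name: String, exprId: Long)

final case class Union(output: Seq[Attr])

final case class Deduplicate(keys: Seq[Attr], child: Union) {
  // A key counts as bound only if the child exposes the same expression ID.
  def resolved: Boolean = {
    val ids = child.output.map(_.exprId).toSet
    keys.forall(k => ids.contains(k.exprId))
  }
}

object ExprIdDemo {
  val original: Union = Union(Seq(Attr("a", 1L), Attr("b", 2L)))
  val dedup: Deduplicate = Deduplicate(original.output, original)

  // Simulate a rewrite whose result keeps the names but carries fresh IDs.
  val rewritten: Union = Union(original.output.map(a => a.copy(exprId = a.exprId + 100)))
  val broken: Deduplicate = dedup.copy(child = rewritten)

  def main(args: Array[String]): Unit = {
    println(dedup.resolved)  // keys and child share expression IDs
    println(broken.resolved) // keys still carry the old IDs
  }
}
```

In this model a guard such as `keys == u.output` only helps while the rewritten child keeps the original IDs; whether the real rule preserves them is exactly the question raised in the review.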








[GitHub] [spark] SparkQA commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


SparkQA commented on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776453331


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39674/
   






[GitHub] [spark] SparkQA commented on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


SparkQA commented on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776452940


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39673/
   






[GitHub] [spark] zhengruifeng commented on pull request #31531: [SPARK-34080][ML][PYTHON][FOLLOW-UP] Update score function in UnivariateFeatureSelector document

2021-02-09 Thread GitBox


zhengruifeng commented on pull request #31531:
URL: https://github.com/apache/spark/pull/31531#issuecomment-776452308


   Late LGTM






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29818: [SPARK-32953][PYTHON] Add Arrow self_destruct support to toPandas

2021-02-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29818:
URL: https://github.com/apache/spark/pull/29818#issuecomment-776451202


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135086/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


AmplabJenkins removed a comment on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776451204


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39672/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


AmplabJenkins removed a comment on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776451205


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39670/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31450: [SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-09 Thread GitBox


AmplabJenkins removed a comment on pull request #31450:
URL: https://github.com/apache/spark/pull/31450#issuecomment-776451201










[GitHub] [spark] AmplabJenkins commented on pull request #29818: [SPARK-32953][PYTHON] Add Arrow self_destruct support to toPandas

2021-02-09 Thread GitBox


AmplabJenkins commented on pull request #29818:
URL: https://github.com/apache/spark/pull/29818#issuecomment-776451202


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135086/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


AmplabJenkins commented on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776451205


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39670/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


AmplabJenkins commented on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776451204


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39672/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31450: [SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-09 Thread GitBox


AmplabJenkins commented on pull request #31450:
URL: https://github.com/apache/spark/pull/31450#issuecomment-776451201










[GitHub] [spark] cloud-fan commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

2021-02-09 Thread GitBox


cloud-fan commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776449191


   @holdenk Can we make sure PRs are merged with at least one approval from 
committers? And also please enrich the PR description a bit more: I don't see 
where the delegation happens. This PR simply removes the name check in 
`SessionCatalogAndIdentifier`.
   
   I'm not going to revert it as the behavior change seems only to happen in 
the error message. But I think we should explain clearly how we handle invalid 
identifiers now. There are two different errors: `Unsupported function name 
...` and `Table or view not found ...`, and I'm curious about what leads to 
this difference.






[GitHub] [spark] SparkQA commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


SparkQA commented on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776447930


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39672/
   






[GitHub] [spark] SparkQA commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


SparkQA commented on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776446396


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39672/
   






[GitHub] [spark] cloud-fan commented on a change in pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

2021-02-09 Thread GitBox


cloud-fan commented on a change in pull request #31427:
URL: https://github.com/apache/spark/pull/31427#discussion_r573453436



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##
@@ -2216,27 +2213,46 @@ class DataSourceV2SQLSuite
 sql("CREATE TABLE t USING json AS SELECT 1 AS i")
 
 val t = "spark_catalog.t"
+
 def verify(sql: String): Unit = {
   val e = intercept[AnalysisException](spark.sql(sql))
-  assert(e.message.contains(
-s"The namespace in session catalog must have exactly one name 
part: $t"))
+  assert(e.message.contains(s"Table or view not found: $t"),

Review comment:
   The error message is confusing now because there is a table `t` and 
Spark still complains that `Table or view not found`. 








[GitHub] [spark] SparkQA removed a comment on pull request #29818: [SPARK-32953][PYTHON] Add Arrow self_destruct support to toPandas

2021-02-09 Thread GitBox


SparkQA removed a comment on pull request #29818:
URL: https://github.com/apache/spark/pull/29818#issuecomment-776347207


   **[Test build #135086 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135086/testReport)**
 for PR 29818 at commit 
[`64d0301`](https://github.com/apache/spark/commit/64d03012616c9bc56b97693d1fdf8132493deb0e).






[GitHub] [spark] SparkQA commented on pull request #29818: [SPARK-32953][PYTHON] Add Arrow self_destruct support to toPandas

2021-02-09 Thread GitBox


SparkQA commented on pull request #29818:
URL: https://github.com/apache/spark/pull/29818#issuecomment-776444897


   **[Test build #135086 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135086/testReport)**
 for PR 29818 at commit 
[`64d0301`](https://github.com/apache/spark/commit/64d03012616c9bc56b97693d1fdf8132493deb0e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #31450: [SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-09 Thread GitBox


SparkQA commented on pull request #31450:
URL: https://github.com/apache/spark/pull/31450#issuecomment-776444351


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39669/
   






[GitHub] [spark] SparkQA removed a comment on pull request #31450: [SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-09 Thread GitBox


SparkQA removed a comment on pull request #31450:
URL: https://github.com/apache/spark/pull/31450#issuecomment-776398704


   **[Test build #135087 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135087/testReport)**
 for PR 31450 at commit 
[`7eb3df5`](https://github.com/apache/spark/commit/7eb3df53aae53cd8c28c69d9bf5bff40abf64e70).






[GitHub] [spark] SparkQA commented on pull request #31450: [SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-09 Thread GitBox


SparkQA commented on pull request #31450:
URL: https://github.com/apache/spark/pull/31450#issuecomment-776443862


   **[Test build #135087 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135087/testReport)**
 for PR 31450 at commit 
[`7eb3df5`](https://github.com/apache/spark/commit/7eb3df53aae53cd8c28c69d9bf5bff40abf64e70).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] mridulm edited a comment on pull request #31540: [SPARK-20977][CORE] Use a non-final field for the state of CollectionAccumulator

2021-02-09 Thread GitBox


mridulm edited a comment on pull request #31540:
URL: https://github.com/apache/spark/pull/31540#issuecomment-776442655


   This does not necessarily solve the issue that @zsxwing detailed - the issue 
here is `registerAccumulator` should not be called in `readObject` before 
subclasses have completed readObject.
   
   One possible solution would be to introduce two methods.
   
   a) A protected method `doHandleDriverSideAccumulator()` in `AccumulatorV2` - 
which has all the code after `defaultReadObject` in readObject.
   b) Call `handleDriverSideAccumulator` after `defaultReadObject` in 
`AccumulatorV2`. In `AccumulatorV2`, this protected method will simply delegate 
to `doHandleDriverSideAccumulator`.
   c) In subclasses with local state, override `handleDriverSideAccumulator` 
to make it do nothing - and after `readObject` in the subclass is done, invoke 
`doHandleDriverSideAccumulator`.
   
   This will ensure AccumulatorV2 and subclasses will register only after state 
has been initialized.
   (Rough sketch, please change logic/names/etc as relevant).
   
   Note, there are other accumulators with local state; we should do this for 
all.
   Thoughts?
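
One reading of the proposal above can be sketched with plain Java serialization. This is a toy under stated assumptions, not Spark's code: `Registry`, `BaseAcc`, `ListAcc`, and `SerDemo` are hypothetical names standing in for the driver-side registration, `AccumulatorV2`, a subclass with local state, and a test helper. It also assumes the overridable hook is `handleDriverSideAccumulator` and the method doing the actual work is `doHandleDriverSideAccumulator`.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for whatever registration the base class performs.
object Registry {
  val seen: ArrayBuffer[String] = ArrayBuffer.empty[String]
  def register(acc: BaseAcc): Unit = seen += acc.describe
}

abstract class BaseAcc extends Serializable {
  def describe: String

  // (a) The work that previously followed defaultReadObject.
  protected final def doHandleDriverSideAccumulator(): Unit = Registry.register(this)

  // (b) Overridable hook; by default it just delegates.
  protected def handleDriverSideAccumulator(): Unit = doHandleDriverSideAccumulator()

  // Runs before any subclass fields have been restored.
  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    handleDriverSideAccumulator()
  }
}

class ListAcc extends BaseAcc {
  private var items: java.util.ArrayList[String] = new java.util.ArrayList[String]()
  def add(s: String): Unit = items.add(s)

  // items is still null while BaseAcc.readObject runs, so this must not be called then.
  def describe: String = s"ListAcc(size=${items.size})"

  // (c) Suppress the early registration triggered by the base class...
  override protected def handleDriverSideAccumulator(): Unit = ()

  // ...and register once this class's own state has been read.
  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    doHandleDriverSideAccumulator()
  }
}

object SerDemo {
  // Serialize to a byte array and read the object back.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

Without the override in step (c), the base class hook would call `describe` while `items` is still null. The cost of this design is that every subclass with local state has to remember both the override and the explicit call.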






[GitHub] [spark] mridulm edited a comment on pull request #31540: [SPARK-20977][CORE] Use a non-final field for the state of CollectionAccumulator

2021-02-09 Thread GitBox


mridulm edited a comment on pull request #31540:
URL: https://github.com/apache/spark/pull/31540#issuecomment-776442655


   This does not necessarily solve the issue that @zsxwing detailed - the issue 
here is `registerAccumulator` should not be called in `readObject` before 
subclasses have completed readObject.
   
   One possible solution would be to introduce two methods.
   
   a) A protected method `doHandleDriverSideAccumulator()` in `AccumulatorV2` - 
which has all the code after `defaultReadObject` in readObject.
   b) Call `handleDriverSideAccumulator` after `defaultReadObject` in 
`AccumulatorV2`. In `AccumulatorV2`, this protected method will simply delegate 
to `doHandleDriverSideAccumulator`.
   c) In subclasses with local state, override `handleDriverSideAccumulator` 
to make it do nothing - and after `readObject` in the subclass is done, invoke 
`doHandleDriverSideAccumulator`.
   
   This will ensure AccumulatorV2 and subclasses will register only after state 
has been initialized.
   (Rough sketch, please change logic/names/etc as relevant).
   
   Thoughts?






[GitHub] [spark] mridulm commented on pull request #31540: [SPARK-20977][CORE] Use a non-final field for the state of CollectionAccumulator

2021-02-09 Thread GitBox


mridulm commented on pull request #31540:
URL: https://github.com/apache/spark/pull/31540#issuecomment-776442655


   This does not necessarily solve the issue that @zsxwing detailed - the issue 
here is `registerAccumulator` should not be called in `readObject` before 
subclasses have completed readObject.
   
   One possible solution would be to introduce two methods.
   
   a) A protected method `doHandleDriverSideAccumulator()` - which has all the 
code after `defaultReadObject` in readObject.
   b) Call `handleDriverSideAccumulator` after `defaultReadObject` in 
`AccumulatorV2`. In `AccumulatorV2`, this protected method will simply delegate 
to `doHandleDriverSideAccumulator`.
   c) In subclasses with local state, override `handleDriverSideAccumulator` 
to make it do nothing - and after `readObject` in the subclass is done, invoke 
`doHandleDriverSideAccumulator`.
   
   This will ensure AccumulatorV2 and subclasses will register only after state 
has been initialized.
   (Rough sketch, please change logic/names/etc as relevant).
   
   Thoughts?






[GitHub] [spark] SparkQA commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


SparkQA commented on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776440924


   **[Test build #135092 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135092/testReport)**
 for PR 31468 at commit 
[`75c4d92`](https://github.com/apache/spark/commit/75c4d92dc7d24b6698b2eb7e5d103e95e925247a).






[GitHub] [spark] attilapiros edited a comment on pull request #31450: [SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-09 Thread GitBox


attilapiros edited a comment on pull request #31450:
URL: https://github.com/apache/spark/pull/31450#issuecomment-776439581


   > Wonderful,...
   
   Especially after it's been spiced up with my painting skills :)






[GitHub] [spark] viirya closed pull request #31462: [SPARK-34347][SQL] CatalogImpl.uncacheTable should invalidate in cascade for temp views

2021-02-09 Thread GitBox


viirya closed pull request #31462:
URL: https://github.com/apache/spark/pull/31462


   






[GitHub] [spark] attilapiros commented on pull request #31450: [SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-09 Thread GitBox


attilapiros commented on pull request #31450:
URL: https://github.com/apache/spark/pull/31450#issuecomment-776439581


   > Wonderful, thanks for picking this up :)
   
   Especially after it's been spiced up with my painting skills 






[GitHub] [spark] viirya commented on pull request #31462: [SPARK-34347][SQL] CatalogImpl.uncacheTable should invalidate in cascade for temp views

2021-02-09 Thread GitBox


viirya commented on pull request #31462:
URL: https://github.com/apache/spark/pull/31462#issuecomment-776439223


   Thanks! Merging to master.






[GitHub] [spark] SparkQA commented on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


SparkQA commented on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776434393


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39670/
   






[GitHub] [spark] SparkQA commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


SparkQA commented on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776434334


   **[Test build #135090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135090/testReport)** for PR 31468 at commit [`c9a5a81`](https://github.com/apache/spark/commit/c9a5a81d8a6e612bd2c8b7e82c00587edcd70cbc).






[GitHub] [spark] holdenk commented on pull request #31450: [SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-09 Thread GitBox


holdenk commented on pull request #31450:
URL: https://github.com/apache/spark/pull/31450#issuecomment-776432824


   Wonderful, thanks for picking this up :)






[GitHub] [spark] mridulm edited a comment on pull request #30650: [SPARK-24818][CORE] Support delay scheduling for barrier execution

2021-02-09 Thread GitBox


mridulm edited a comment on pull request #30650:
URL: https://github.com/apache/spark/pull/30650#issuecomment-776431823


   The changes look good to me, thanks @Ngone51 !
   Can you merge it when your comments are addressed @tgravescs ? Thx






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


AmplabJenkins removed a comment on pull request #31468:
URL: https://github.com/apache/spark/pull/31468#issuecomment-776151186


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135075/
   






[GitHub] [spark] mridulm commented on pull request #30650: [SPARK-24818][CORE] Support delay scheduling for barrier execution

2021-02-09 Thread GitBox


mridulm commented on pull request #30650:
URL: https://github.com/apache/spark/pull/30650#issuecomment-776431823


   The changes look good to me, thanks @Ngone51 !






[GitHub] [spark] zhengruifeng commented on a change in pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


zhengruifeng commented on a change in pull request #31468:
URL: https://github.com/apache/spark/pull/31468#discussion_r573438154



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala
##
@@ -200,28 +210,32 @@ case class TakeOrderedAndProjectExec(
   protected override def doExecute(): RDD[InternalRow] = {
     val ord = new LazilyGeneratedOrdering(sortOrder, child.output)
     val childRDD = child.execute()
-    val singlePartitionRDD = if (childRDD.getNumPartitions > 1) {
-      val localTopK = childRDD.mapPartitions { iter =>
-        org.apache.spark.util.collection.Utils.takeOrdered(iter.map(_.copy()), limit)(ord)
-      }
-      new ShuffledRowRDD(
-        ShuffleExchangeExec.prepareShuffleDependency(
-          localTopK,
-          child.output,
-          SinglePartition,
-          serializer,
-          writeMetrics),
-        readMetrics)
+    if (childRDD.getNumPartitions == 0) {
+      sparkContext.parallelize(Array.empty[InternalRow], 1)

Review comment:
   ok, I will update it.








[GitHub] [spark] mridulm commented on pull request #30650: [SPARK-24818][CORE] Support delay scheduling for barrier execution

2021-02-09 Thread GitBox


mridulm commented on pull request #30650:
URL: https://github.com/apache/spark/pull/30650#issuecomment-776431450


   I agree, I think it is fine.






[GitHub] [spark] SparkQA commented on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


SparkQA commented on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776430570


   **[Test build #135091 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135091/testReport)** for PR 31442 at commit [`b838218`](https://github.com/apache/spark/commit/b83821806c87ba6d710efe604a7c9d270af9617b).






[GitHub] [spark] maropu commented on a change in pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


maropu commented on a change in pull request #31468:
URL: https://github.com/apache/spark/pull/31468#discussion_r573435246



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala
##
@@ -200,28 +210,32 @@ case class TakeOrderedAndProjectExec(
   protected override def doExecute(): RDD[InternalRow] = {
     val ord = new LazilyGeneratedOrdering(sortOrder, child.output)
     val childRDD = child.execute()
-    val singlePartitionRDD = if (childRDD.getNumPartitions > 1) {
-      val localTopK = childRDD.mapPartitions { iter =>
-        org.apache.spark.util.collection.Utils.takeOrdered(iter.map(_.copy()), limit)(ord)
-      }
-      new ShuffledRowRDD(
-        ShuffleExchangeExec.prepareShuffleDependency(
-          localTopK,
-          child.output,
-          SinglePartition,
-          serializer,
-          writeMetrics),
-        readMetrics)
+    if (childRDD.getNumPartitions == 0) {
+      sparkContext.parallelize(Array.empty[InternalRow], 1)

Review comment:
   nit: how about using `ParallelCollectionRDD` directly?








[GitHub] [spark] sarutak commented on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


sarutak commented on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776427904


   @maropu Thanks. I've updated.






[GitHub] [spark] zhengruifeng commented on a change in pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


zhengruifeng commented on a change in pull request #31468:
URL: https://github.com/apache/spark/pull/31468#discussion_r573433676



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala
##
@@ -52,16 +52,21 @@ case class CollectLimitExec(limit: Int, child: SparkPlan) extends LimitExec {
     SQLShuffleReadMetricsReporter.createShuffleReadMetrics(sparkContext)
   override lazy val metrics = readMetrics ++ writeMetrics
   protected override def doExecute(): RDD[InternalRow] = {
-    val locallyLimited = child.execute().mapPartitionsInternal(_.take(limit))
-    val shuffled = new ShuffledRowRDD(
-      ShuffleExchangeExec.prepareShuffleDependency(
-        locallyLimited,
-        child.output,
-        SinglePartition,
-        serializer,
-        writeMetrics),
-      readMetrics)
-    shuffled.mapPartitionsInternal(_.take(limit))
+    val childRDD = child.execute()
+    val singlePartitionRDD = if (childRDD.getNumPartitions != 1) {

Review comment:
   Following [CoalesceExec.doExecute](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala#L715), I updated the fix to directly return an empty RDD with a single partition if `childRDD.getNumPartitions == 0`.
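
   The three cases discussed in this thread (zero partitions, exactly one partition, several partitions) can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not use Spark, an "RDD" is modeled as a plain `Vector` of partitions, and the names `LimitSketch`, `Partitions`, `shuffleToSingle` and `collectLimit` are hypothetical and do not appear in the PR.

```scala
// Toy model only: an "RDD" is a Vector of partitions. This is NOT Spark's API.
object LimitSketch {
  type Partitions[A] = Vector[Vector[A]]

  // Hypothetical stand-in for the shuffle to a single partition:
  // take `limit` rows from each partition, then concatenate the results.
  def shuffleToSingle[A](parts: Partitions[A], limit: Int): Partitions[A] =
    Vector(parts.flatMap(_.take(limit)))

  // Always yields exactly one partition holding at most `limit` rows.
  def collectLimit[A](child: Partitions[A], limit: Int): Partitions[A] = {
    val single: Partitions[A] = child.length match {
      case 0 => Vector(Vector.empty[A])        // empty input: still one (empty) partition
      case 1 => child                          // already a single partition: skip the shuffle
      case _ => shuffleToSingle(child, limit)  // several partitions: local limit, then merge
    }
    single.map(_.take(limit))
  }
}
```

   In this model the shuffle is only paid for when the input has more than one partition, and the zero-partition case still reports a single partition, which is the property the comment above is about.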








[GitHub] [spark] maropu commented on pull request #31491: [SPARK-34379][SQL] Map JDBC RowID to StringType rather than LongType

2021-02-09 Thread GitBox


maropu commented on pull request #31491:
URL: https://github.com/apache/spark/pull/31491#issuecomment-776425851


   Could you add tests in the `core` package?






[GitHub] [spark] zhengruifeng commented on a change in pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-09 Thread GitBox


zhengruifeng commented on a change in pull request #31468:
URL: https://github.com/apache/spark/pull/31468#discussion_r573432645



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala
##
@@ -200,7 +205,7 @@ case class TakeOrderedAndProjectExec(
   protected override def doExecute(): RDD[InternalRow] = {
     val ord = new LazilyGeneratedOrdering(sortOrder, child.output)
     val childRDD = child.execute()
-    val singlePartitionRDD = if (childRDD.getNumPartitions > 1) {
+    val singlePartitionRDD = if (childRDD.getNumPartitions != 1) {

Review comment:
   There is no separate test suite for `CollectLimitExec`, so I added the cases `childRDD.getNumPartitions == 0` and `childRDD.getNumPartitions == 1` to `TakeOrderedAndProjectSuite`.








[GitHub] [spark] SparkQA commented on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


SparkQA commented on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776425676


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39670/
   






[GitHub] [spark] maropu commented on a change in pull request #31491: [SPARK-34379][SQL] Map JDBC RowID to StringType rather than LongType

2021-02-09 Thread GitBox


maropu commented on a change in pull request #31491:
URL: https://github.com/apache/spark/pull/31491#discussion_r573431650



##
File path: docs/sql-migration-guide.md
##
@@ -24,6 +24,8 @@ license: |
 
 ## Upgrading from Spark SQL 3.1 to 3.2
 
+  - Since Spark 3.2, all the supported JDBC dialects use StringType for ROWID. Previously, Oracle dialect uses StringType and the other dialects use LongType.

Review comment:
   nit: Previously, => In Spark 3.1 or earlier,








[GitHub] [spark] maropu commented on a change in pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


maropu commented on a change in pull request #31442:
URL: https://github.com/apache/spark/pull/31442#discussion_r573431597



##
File path: docs/sql-migration-guide.md
##
@@ -24,6 +24,8 @@ license: |
 
 ## Upgrading from Spark SQL 3.1 to 3.2
 
+  - In Spark 3.2, PostgreSQL JDBC dialect uses StringType for MONEY and MONEY[] is not supported due to the JDBC driver for PostgreSQL can't handle those types properly. Previously, DoubleType and ArrayType of DoubleType are used respectively.

Review comment:
   nit: `Previously, ` => `In Spark 3.1 or earlier,`








[GitHub] [spark] attilapiros commented on pull request #31450: [SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-09 Thread GitBox


attilapiros commented on pull request #31450:
URL: https://github.com/apache/spark/pull/31450#issuecomment-776422517


   cc. @holdenk 






[GitHub] [spark] SparkQA commented on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-09 Thread GitBox


SparkQA commented on pull request #31337:
URL: https://github.com/apache/spark/pull/31337#issuecomment-776419676


   **[Test build #135089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135089/testReport)** for PR 31337 at commit [`ebc738b`](https://github.com/apache/spark/commit/ebc738b64b75ccbd0fe1fd406f12b7676816877a).






[GitHub] [spark] maropu closed pull request #31500: [SPARK-34388][SQL] Propagate the registered UDF name to ScalaUDF, ScalaUDAF and ScalaAggregator

2021-02-09 Thread GitBox


maropu closed pull request #31500:
URL: https://github.com/apache/spark/pull/31500


   






[GitHub] [spark] maropu commented on pull request #31500: [SPARK-34388][SQL] Propagate the registered UDF name to ScalaUDF, ScalaUDAF and ScalaAggregator

2021-02-09 Thread GitBox


maropu commented on pull request #31500:
URL: https://github.com/apache/spark/pull/31500#issuecomment-776419198


   It seems this PR has already been merged, so I'll close this.






[GitHub] [spark] maropu commented on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly

2021-02-09 Thread GitBox


maropu commented on pull request #31442:
URL: https://github.com/apache/spark/pull/31442#issuecomment-776418535


   Could you resolve the conflict? Looks fine otherwise.






[GitHub] [spark] wangyum commented on pull request #31528: [SPARK-34403][SQL]Remove dependency to commons-httpclient

2021-02-09 Thread GitBox


wangyum commented on pull request #31528:
URL: https://github.com/apache/spark/pull/31528#issuecomment-776414106


   
   [`hive-exec`](https://github.com/apache/hive/blob/rel/release-2.3.8/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java#L34) needs this dependency.






[GitHub] [spark] maropu commented on pull request #31384: [SPARK-31816][SQL][DOCS] Added high level description about JDBC connection providers for users/developers

2021-02-09 Thread GitBox


maropu commented on pull request #31384:
URL: https://github.com/apache/spark/pull/31384#issuecomment-776413210


   Thanks! Merged to master.






[GitHub] [spark] maropu closed pull request #31384: [SPARK-31816][SQL][DOCS] Added high level description about JDBC connection providers for users/developers

2021-02-09 Thread GitBox


maropu closed pull request #31384:
URL: https://github.com/apache/spark/pull/31384


   






[GitHub] [spark] SparkQA commented on pull request #31450: [WIP][SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-09 Thread GitBox


SparkQA commented on pull request #31450:
URL: https://github.com/apache/spark/pull/31450#issuecomment-776413082


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39669/
   






[GitHub] [spark] cloud-fan closed pull request #31073: [SPARK-33995][SQL] Expose make_interval as a Scala function

2021-02-09 Thread GitBox


cloud-fan closed pull request #31073:
URL: https://github.com/apache/spark/pull/31073


   






[GitHub] [spark] cloud-fan commented on pull request #31073: [SPARK-33995][SQL] Expose make_interval as a Scala function

2021-02-09 Thread GitBox


cloud-fan commented on pull request #31073:
URL: https://github.com/apache/spark/pull/31073#issuecomment-776412824


   thanks, merging to master!






[GitHub] [spark] maropu commented on pull request #31384: [SPARK-31816][SQL][DOCS] Added high level description about JDBC connection providers for users/developers

2021-02-09 Thread GitBox


maropu commented on pull request #31384:
URL: https://github.com/apache/spark/pull/31384#issuecomment-776412764


   Sorry for my late reply... Yes, thanks for the work, @gaborgsomogyi. Looks good now. I'll merge this.






[GitHub] [spark] AngersZhuuuu commented on pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-09 Thread GitBox


AngersZhuuuu commented on pull request #30957:
URL: https://github.com/apache/spark/pull/30957#issuecomment-776412512


   Gentle ping @maropu @cloud-fan  





