[GitHub] [spark] LuciferYang commented on pull request #29598: [SPARK-32755][SQL] Maintain the order of expressions in AttributeSet and ExpressionSet

2020-09-07 Thread GitBox
LuciferYang commented on pull request #29598: URL: https://github.com/apache/spark/pull/29598#issuecomment-688661013 @cloud-fan @Ngone51 we can use https://github.com/apache/spark/pull/29660 This is an automated message fro

[GitHub] [spark] cloud-fan commented on a change in pull request #29589: [SPARK-32748][SQL] Support local property propagation in SubqueryBroadcastExec

2020-09-07 Thread GitBox
cloud-fan commented on a change in pull request #29589: URL: https://github.com/apache/spark/pull/29589#discussion_r484691881 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SubqueryBroadcastExec.scala ## @@ -60,10 +63,12 @@ case class SubqueryBroadcastExe

[GitHub] [spark] cloud-fan commented on pull request #29598: [SPARK-32755][SQL] Maintain the order of expressions in AttributeSet and ExpressionSet

2020-09-07 Thread GitBox
cloud-fan commented on pull request #29598: URL: https://github.com/apache/spark/pull/29598#issuecomment-688658978 @LuciferYang do you have a branch that contains the compilation fix?

[GitHub] [spark] LuciferYang commented on pull request #29598: [SPARK-32755][SQL] Maintain the order of expressions in AttributeSet and ExpressionSet

2020-09-07 Thread GitBox
LuciferYang commented on pull request #29598: URL: https://github.com/apache/spark/pull/29598#issuecomment-688653113 @Ngone51 this needs some simple compilation fixes for Scala 2.13 in `QueryPlan` and `ShuffleBlockFetcherIterator`.

[GitHub] [spark] wzhfy commented on a change in pull request #29589: [SPARK-32748][SQL] Support local property propagation in SubqueryBroadcastExec

2020-09-07 Thread GitBox
wzhfy commented on a change in pull request #29589: URL: https://github.com/apache/spark/pull/29589#discussion_r484684202 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SubqueryBroadcastExec.scala ## @@ -60,10 +63,12 @@ case class SubqueryBroadcastExec(

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29665: [SPARK-32753][SQL][3.0] Only copy tags to node with no tags

2020-09-07 Thread GitBox
dongjoon-hyun commented on a change in pull request #29665: URL: https://github.com/apache/spark/pull/29665#discussion_r484683489 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala ## @@ -889,4 +889,18 @@ class AdaptiveQu

[GitHub] [spark] dongjoon-hyun commented on pull request #29665: [SPARK-32753][SQL][3.0] Only copy tags to node with no tags

2020-09-07 Thread GitBox
dongjoon-hyun commented on pull request #29665: URL: https://github.com/apache/spark/pull/29665#issuecomment-688650945 @manuzhang and @cloud-fan, if this is a kind of correctness issue, could you add a label to the JIRA, please?

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29665: [SPARK-32753][SQL][3.0] Only copy tags to node with no tags

2020-09-07 Thread GitBox
dongjoon-hyun commented on a change in pull request #29665: URL: https://github.com/apache/spark/pull/29665#discussion_r484682143 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala ## @@ -889,4 +889,18 @@ class AdaptiveQu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29065: [WIP][SPARK-32268][SQL] Bloom Filter Join

2020-09-07 Thread GitBox
AmplabJenkins removed a comment on pull request #29065: URL: https://github.com/apache/spark/pull/29065#issuecomment-688647916

[GitHub] [spark] cloud-fan closed pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
cloud-fan closed pull request #29658: URL: https://github.com/apache/spark/pull/29658

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
AmplabJenkins removed a comment on pull request #29658: URL: https://github.com/apache/spark/pull/29658#issuecomment-688647888 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on pull request #29065: [WIP][SPARK-32268][SQL] Bloom Filter Join

2020-09-07 Thread GitBox
AmplabJenkins commented on pull request #29065: URL: https://github.com/apache/spark/pull/29065#issuecomment-688647916

[GitHub] [spark] AmplabJenkins commented on pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
AmplabJenkins commented on pull request #29658: URL: https://github.com/apache/spark/pull/29658#issuecomment-688647888

[GitHub] [spark] cloud-fan commented on pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
cloud-fan commented on pull request #29658: URL: https://github.com/apache/spark/pull/29658#issuecomment-688647688 GitHub Actions passes, thanks, merging to 3.0!

[GitHub] [spark] SparkQA commented on pull request #29065: [WIP][SPARK-32268][SQL] Bloom Filter Join

2020-09-07 Thread GitBox
SparkQA commented on pull request #29065: URL: https://github.com/apache/spark/pull/29065#issuecomment-688647269 **[Test build #128383 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128383/testReport)** for PR 29065 at commit [`1a8cc9b`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
SparkQA commented on pull request #29658: URL: https://github.com/apache/spark/pull/29658#issuecomment-688647202 **[Test build #128382 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128382/testReport)** for PR 29658 at commit [`edcd822`](https://github.com

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
dongjoon-hyun commented on a change in pull request #29667: URL: https://github.com/apache/spark/pull/29667#discussion_r484679330 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLExecutionSuite.scala ## @@ -119,6 +125,39 @@ class SQLExecutionSuite extend

[GitHub] [spark] yaooqinn commented on pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
yaooqinn commented on pull request #29658: URL: https://github.com/apache/spark/pull/29658#issuecomment-688644885 retest this please

[GitHub] [spark] dongjoon-hyun commented on pull request #29605: [SPARK-31511][SQL][2.4] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
dongjoon-hyun commented on pull request #29605: URL: https://github.com/apache/spark/pull/29605#issuecomment-688644569 Thank you for the release preparation, @ScrapCodes.

[GitHub] [spark] cloud-fan commented on pull request #29598: [SPARK-32755][SQL] Maintain the order of expressions in AttributeSet and ExpressionSet

2020-09-07 Thread GitBox
cloud-fan commented on pull request #29598: URL: https://github.com/apache/spark/pull/29598#issuecomment-688638563 Interesting. So `AttributeSet` and `ExpressionSet` behave differently under Scala 2.12 and 2.13. @Ngone51 can you take a look?

[GitHub] [spark] cloud-fan commented on a change in pull request #29589: [SPARK-32748][SQL] Support local property propagation in SubqueryBroadcastExec

2020-09-07 Thread GitBox
cloud-fan commented on a change in pull request #29589: URL: https://github.com/apache/spark/pull/29589#discussion_r484669025 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SubqueryBroadcastExec.scala ## @@ -60,10 +63,12 @@ case class SubqueryBroadcastExe

[GitHub] [spark] maropu commented on a change in pull request #29589: [SPARK-32748][SQL] Support local property propagation in SubqueryBroadcastExec

2020-09-07 Thread GitBox
maropu commented on a change in pull request #29589: URL: https://github.com/apache/spark/pull/29589#discussion_r484669248 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SubqueryBroadcastExec.scala ## @@ -60,10 +63,12 @@ case class SubqueryBroadcastExec(

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29660: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

2020-09-07 Thread GitBox
AmplabJenkins removed a comment on pull request #29660: URL: https://github.com/apache/spark/pull/29660#issuecomment-688634013

[GitHub] [spark] AmplabJenkins commented on pull request #29660: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

2020-09-07 Thread GitBox
AmplabJenkins commented on pull request #29660: URL: https://github.com/apache/spark/pull/29660#issuecomment-688634013

[GitHub] [spark] SparkQA commented on pull request #29660: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

2020-09-07 Thread GitBox
SparkQA commented on pull request #29660: URL: https://github.com/apache/spark/pull/29660#issuecomment-688633651 **[Test build #128381 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128381/testReport)** for PR 29660 at commit [`454b53c`](https://github.com

[GitHub] [spark] LuciferYang commented on pull request #29660: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

2020-09-07 Thread GitBox
LuciferYang commented on pull request #29660: URL: https://github.com/apache/spark/pull/29660#issuecomment-688632207 Address 454b53c: merge upstream master and resolve the conflicting file.

[GitHub] [spark] zero323 commented on pull request #29591: [SPARK-32714][PYTHON] Initial pyspark-stubs port.

2020-09-07 Thread GitBox
zero323 commented on pull request #29591: URL: https://github.com/apache/spark/pull/29591#issuecomment-688630650 __Note__ In `pyspark-stubs` we use data-driven test cases ([scenarios](https://github.com/zero323/pyspark-stubs/tree/master/test-data/unit) and [runner](https://github.

[GitHub] [spark] LuciferYang edited a comment on pull request #29598: [SPARK-32755][SQL] Maintain the order of expressions in AttributeSet and ExpressionSet

2020-09-07 Thread GitBox
LuciferYang edited a comment on pull request #29598: URL: https://github.com/apache/spark/pull/29598#issuecomment-688627206 So do we always need to regenerate the golden files with Scala 2.13? Or do we need to use different golden files for different Scala versions? That feels a little unreasonable...

[GitHub] [spark] wzhfy commented on a change in pull request #29589: [SPARK-32748][SQL] Support local property propagation in SubqueryBroadcastExec

2020-09-07 Thread GitBox
wzhfy commented on a change in pull request #29589: URL: https://github.com/apache/spark/pull/29589#discussion_r484657207 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SubqueryBroadcastExec.scala ## @@ -60,10 +63,12 @@ case class SubqueryBroadcastExec(

[GitHub] [spark] LuciferYang commented on pull request #29598: [SPARK-32755][SQL] Maintain the order of expressions in AttributeSet and ExpressionSet

2020-09-07 Thread GitBox
LuciferYang commented on pull request #29598: URL: https://github.com/apache/spark/pull/29598#issuecomment-688627206 So do we always need to regenerate the golden files with Scala 2.13? Or do we need to use different golden files for different Scala versions? That feels a little unreasonable...

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29669: [SPARK-31511][FOLLOW-UP][TEST][SQL] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
AmplabJenkins removed a comment on pull request #29669: URL: https://github.com/apache/spark/pull/29669#issuecomment-688626498

[GitHub] [spark] AmplabJenkins commented on pull request #29669: [SPARK-31511][FOLLOW-UP][TEST][SQL] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
AmplabJenkins commented on pull request #29669: URL: https://github.com/apache/spark/pull/29669#issuecomment-688626498

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
AmplabJenkins removed a comment on pull request #29658: URL: https://github.com/apache/spark/pull/29658#issuecomment-688626020 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128

[GitHub] [spark] wzhfy commented on a change in pull request #29589: [SPARK-32748][SQL] Support local property propagation in SubqueryBroadcastExec

2020-09-07 Thread GitBox
wzhfy commented on a change in pull request #29589: URL: https://github.com/apache/spark/pull/29589#discussion_r484657832 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SubqueryBroadcastExec.scala ## @@ -60,10 +63,12 @@ case class SubqueryBroadcastExec(

[GitHub] [spark] SparkQA commented on pull request #29669: [SPARK-31511][FOLLOW-UP][TEST][SQL] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
SparkQA commented on pull request #29669: URL: https://github.com/apache/spark/pull/29669#issuecomment-688626087 **[Test build #128380 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128380/testReport)** for PR 29669 at commit [`fb9489f`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
AmplabJenkins removed a comment on pull request #29658: URL: https://github.com/apache/spark/pull/29658#issuecomment-688626014 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
AmplabJenkins commented on pull request #29658: URL: https://github.com/apache/spark/pull/29658#issuecomment-688626014

[GitHub] [spark] LuciferYang commented on pull request #29598: [SPARK-32755][SQL] Maintain the order of expressions in AttributeSet and ExpressionSet

2020-09-07 Thread GitBox
LuciferYang commented on pull request #29598: URL: https://github.com/apache/spark/pull/29598#issuecomment-688625923 @cloud-fan Maybe I didn't describe it clearly. I now use the Spark master branch to run the Maven tests with Scala 2.12 ``` mvn clean test -pl sql/core -Dtest=no

[GitHub] [spark] SparkQA removed a comment on pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
SparkQA removed a comment on pull request #29658: URL: https://github.com/apache/spark/pull/29658#issuecomment-688573747 **[Test build #128371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128371/testReport)** for PR 29658 at commit [`edcd822`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29658: [SPARK-32785][SQL][3.0] Interval with dangling parts should not result null

2020-09-07 Thread GitBox
SparkQA commented on pull request #29658: URL: https://github.com/apache/spark/pull/29658#issuecomment-688625549 **[Test build #128371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128371/testReport)** for PR 29658 at commit [`edcd822`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29669: [SPARK-31511][FOLLOW-UP][TEST][SQL] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
AmplabJenkins removed a comment on pull request #29669: URL: https://github.com/apache/spark/pull/29669#issuecomment-688620349 Can one of the admins verify this patch?

[GitHub] [spark] cloud-fan commented on pull request #29669: [SPARK-31511][FOLLOW-UP][TEST][SQL] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
cloud-fan commented on pull request #29669: URL: https://github.com/apache/spark/pull/29669#issuecomment-688624276 ok to test

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29586: [SPARK-32743][SQL] Add distinct info at UnresolvedFunction toString

2020-09-07 Thread GitBox
AmplabJenkins removed a comment on pull request #29586: URL: https://github.com/apache/spark/pull/29586#issuecomment-688620505

[GitHub] [spark] AmplabJenkins commented on pull request #29669: [SPARK-31511][FOLLOW-UP][TEST][SQL] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
AmplabJenkins commented on pull request #29669: URL: https://github.com/apache/spark/pull/29669#issuecomment-688620349 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29669: [SPARK-31511][FOLLOW-UP][TEST][SQL] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
AmplabJenkins removed a comment on pull request #29669: URL: https://github.com/apache/spark/pull/29669#issuecomment-688620039 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #29586: [SPARK-32743][SQL] Add distinct info at UnresolvedFunction toString

2020-09-07 Thread GitBox
AmplabJenkins commented on pull request #29586: URL: https://github.com/apache/spark/pull/29586#issuecomment-688620505

[GitHub] [spark] SparkQA commented on pull request #29586: [SPARK-32743][SQL] Add distinct info at UnresolvedFunction toString

2020-09-07 Thread GitBox
SparkQA commented on pull request #29586: URL: https://github.com/apache/spark/pull/29586#issuecomment-688620214 **[Test build #128379 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128379/testReport)** for PR 29586 at commit [`7f112c9`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #29669: [SPARK-31511][FOLLOW-UP][TEST][SQL] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
AmplabJenkins commented on pull request #29669: URL: https://github.com/apache/spark/pull/29669#issuecomment-688620039 Can one of the admins verify this patch?

[GitHub] [spark] cxzl25 commented on a change in pull request #29605: [SPARK-31511][SQL][2.4] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
cxzl25 commented on a change in pull request #29605: URL: https://github.com/apache/spark/pull/29605#discussion_r484650666 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/HashedRelationSuite.scala ## @@ -341,6 +341,45 @@ class HashedRelationSuite ext

[GitHub] [spark] cxzl25 opened a new pull request #29669: [SPARK-31511][FOLLOW-UP][TEST][SQL] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
cxzl25 opened a new pull request #29669: URL: https://github.com/apache/spark/pull/29669 ### What changes were proposed in this pull request? Before SPARK-31511 was fixed, `BytesToBytesMap` iterator() was not thread-safe and could cause inaccurate data. We need to add a unit test.

[GitHub] [spark] cloud-fan commented on pull request #29598: [SPARK-32755][SQL] Maintain the order of expressions in AttributeSet and ExpressionSet

2020-09-07 Thread GitBox
cloud-fan commented on pull request #29598: URL: https://github.com/apache/spark/pull/29598#issuecomment-688618849 This should have been merged before we added `PlanStabilitySuite`, as the query plans in golden files were kind of random previously. That's why this PR updates `PlanStabilityS

[GitHub] [spark] LuciferYang commented on pull request #29598: [SPARK-32755][SQL] Maintain the order of expressions in AttributeSet and ExpressionSet

2020-09-07 Thread GitBox
LuciferYang commented on pull request #29598: URL: https://github.com/apache/spark/pull/29598#issuecomment-688615468 Sorry to leave a message on a completed issue. @cloud-fan @dbaliafroozeh This patch seems to introduce some different behavior between Scala 2.12 and Scala 2.13.

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
HyukjinKwon commented on a change in pull request #29667: URL: https://github.com/apache/spark/pull/29667#discussion_r484646383 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLExecutionSuite.scala ## @@ -119,6 +125,39 @@ class SQLExecutionSuite extends

[GitHub] [spark] HyukjinKwon edited a comment on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-09-07 Thread GitBox
HyukjinKwon edited a comment on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-688614177 Just for a bit more context, the commits are regularly digested, summarized, and shared on the dev mailing list. The PRs introducing APIs are usually

[GitHub] [spark] cloud-fan commented on pull request #29579: [SPARK-32736][CORE] Avoid caching the removed decommissioned executors in TaskSchedulerImpl

2020-09-07 Thread GitBox
cloud-fan commented on pull request #29579: URL: https://github.com/apache/spark/pull/29579#issuecomment-688614265 thanks, merging to master!

[GitHub] [spark] cloud-fan closed pull request #29579: [SPARK-32736][CORE] Avoid caching the removed decommissioned executors in TaskSchedulerImpl

2020-09-07 Thread GitBox
cloud-fan closed pull request #29579: URL: https://github.com/apache/spark/pull/29579

[GitHub] [spark] HyukjinKwon edited a comment on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-09-07 Thread GitBox
HyukjinKwon edited a comment on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-688613392 Just for a bit more context, the commits are regularly digested, summarized, and shared on the dev mailing list. The PRs introducing APIs are usually

[GitHub] [spark] HyukjinKwon removed a comment on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-09-07 Thread GitBox
HyukjinKwon removed a comment on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-688613392 Just for a bit more context, the commits are regularly digested, summarized, and shared on the dev mailing list. The PRs introducing APIs are usually

[GitHub] [spark] HyukjinKwon commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-09-07 Thread GitBox
HyukjinKwon commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-688614177 Just for a bit more context, the commits are regularly digested, summarized, and shared on the dev mailing list. The PRs introducing APIs are usually directl

[GitHub] [spark] cloud-fan commented on a change in pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
cloud-fan commented on a change in pull request #29667: URL: https://github.com/apache/spark/pull/29667#discussion_r484645633 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLExecutionSuite.scala ## @@ -119,6 +125,39 @@ class SQLExecutionSuite extends Sp

[GitHub] [spark] HyukjinKwon commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-09-07 Thread GitBox
HyukjinKwon commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-688613392 Just for a bit more context, the commits are regularly digested, summarized, and shared on the dev mailing list. The PRs introducing APIs are usually directl

[GitHub] [spark] ScrapCodes commented on pull request #29605: [SPARK-31511][SQL][2.4] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
ScrapCodes commented on pull request #29605: URL: https://github.com/apache/spark/pull/29605#issuecomment-688613221 Now I can go ahead and tag the release. Good work. (A long battle with CI.)

[GitHub] [spark] dongjoon-hyun commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-09-07 Thread GitBox
dongjoon-hyun commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-688608918 Hi, @gatorsmile. Your request is just about revising this GitHub PR description, right? We cannot change commit logs.

[GitHub] [spark] KevinSmile edited a comment on pull request #29653: [SPARK-32804][Launcher] Fix run-example command builder bug

2020-09-07 Thread GitBox
KevinSmile edited a comment on pull request #29653: URL: https://github.com/apache/spark/pull/29653#issuecomment-688562460 I think it only affects run-example in standalone-cluster mode. As standalone-client mode and yarn-cluster/k8s-cluster/... modes have different logic about how t

[GitHub] [spark] imback82 commented on a change in pull request #29655: [SPARK-32806][SQL] SortMergeJoin with partial hash distribution can be optimized to remove shuffle

2020-09-07 Thread GitBox
imback82 commented on a change in pull request #29655: URL: https://github.com/apache/spark/pull/29655#discussion_r484636101 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/OptimizeSortMergeJoinWithPartialHashDistribution.scala ## @@ -0,0 +1,115 @

[GitHub] [spark] dongjoon-hyun closed pull request #29647: [SPARK-32764][SQL] -0.0 should be equal to 0.0

2020-09-07 Thread GitBox
dongjoon-hyun closed pull request #29647: URL: https://github.com/apache/spark/pull/29647

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29647: [SPARK-32764][SQL] -0.0 should be equal to 0.0

2020-09-07 Thread GitBox
dongjoon-hyun commented on a change in pull request #29647: URL: https://github.com/apache/spark/pull/29647#discussion_r484633284 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/SQLOrderingUtilSuite.scala ## @@ -0,0 +1,75 @@ +/* + * Licensed to the

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29640: [SPARK-32180][PYTHON][DOCS] Installation page of Getting Started in PySpark documentation

2020-09-07 Thread GitBox
HyukjinKwon commented on a change in pull request #29640: URL: https://github.com/apache/spark/pull/29640#discussion_r484633037 ## File path: python/docs/source/getting_started/installation.rst ## @@ -0,0 +1,120 @@ +.. Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29647: [SPARK-32764][SQL] -0.0 should be equal to 0.0

2020-09-07 Thread GitBox
dongjoon-hyun commented on a change in pull request #29647: URL: https://github.com/apache/spark/pull/29647#discussion_r484632927 ## File path: sql/core/src/test/resources/sql-tests/inputs/operators.sql ## @@ -95,3 +95,9 @@ select width_bucket(5.35, 0.024, 10.06, null); select

[GitHub] [spark] dongjoon-hyun commented on pull request #29605: [SPARK-31511][SQL][2.4] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
dongjoon-hyun commented on pull request #29605: URL: https://github.com/apache/spark/pull/29605#issuecomment-688600014 Thank you, @cxzl25 and @cloud-fan.

[GitHub] [spark] viirya commented on a change in pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
viirya commented on a change in pull request #29667: URL: https://github.com/apache/spark/pull/29667#discussion_r484632437 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLExecutionSuite.scala ## @@ -119,6 +125,39 @@ class SQLExecutionSuite extends Spark

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29640: [SPARK-32180][PYTHON][DOCS] Installation page of Getting Started in PySpark documentation

2020-09-07 Thread GitBox
HyukjinKwon commented on a change in pull request #29640: URL: https://github.com/apache/spark/pull/29640#discussion_r484632164 ## File path: python/docs/source/getting_started/installation.rst ## @@ -0,0 +1,120 @@ +.. Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [spark] viirya closed pull request #29645: [SPARK-32796][SQL] Make withField API support nested struct in array

2020-09-07 Thread GitBox
viirya closed pull request #29645: URL: https://github.com/apache/spark/pull/29645 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

[GitHub] [spark] viirya commented on pull request #29645: [SPARK-32796][SQL] Make withField API support nested struct in array

2020-09-07 Thread GitBox
viirya commented on pull request #29645: URL: https://github.com/apache/spark/pull/29645#issuecomment-688599032 Okay, I see. It also makes sense to me. This is a hard trade-off between simplicity and flexibility. I will close this now. If we need this flexibility in the future, we can revi

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29640: [SPARK-32180][PYTHON][DOCS] Installation page of Getting Started in PySpark documentation

2020-09-07 Thread GitBox
HyukjinKwon commented on a change in pull request #29640: URL: https://github.com/apache/spark/pull/29640#discussion_r484631878 ## File path: python/docs/source/getting_started/installation.rst ## @@ -0,0 +1,120 @@ +.. Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [spark] SparkQA commented on pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
SparkQA commented on pull request #29667: URL: https://github.com/apache/spark/pull/29667#issuecomment-688598156 **[Test build #128378 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128378/testReport)** for PR 29667 at commit [`b89eccc`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #29178: [SPARK-32380][SQL] fixed spark3.0 access hive table while data in hbase problem

2020-09-07 Thread GitBox
AmplabJenkins commented on pull request #29178: URL: https://github.com/apache/spark/pull/29178#issuecomment-688598014 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To resp

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29640: [SPARK-32180][PYTHON][DOCS] Installation page of Getting Started in PySpark documentation

2020-09-07 Thread GitBox
HyukjinKwon commented on a change in pull request #29640: URL: https://github.com/apache/spark/pull/29640#discussion_r484631009 ## File path: python/docs/source/getting_started/installation.rst ## @@ -0,0 +1,119 @@ +.. Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [spark] cloud-fan commented on a change in pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
cloud-fan commented on a change in pull request #29667: URL: https://github.com/apache/spark/pull/29667#discussion_r484630405 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLExecutionSuite.scala ## @@ -119,6 +125,39 @@ class SQLExecutionSuite extends Sp

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29640: [SPARK-32180][PYTHON][DOCS] Installation page of Getting Started in PySpark documentation

2020-09-07 Thread GitBox
HyukjinKwon commented on a change in pull request #29640: URL: https://github.com/apache/spark/pull/29640#discussion_r484630365 ## File path: python/docs/source/getting_started/installation.rst ## @@ -0,0 +1,120 @@ +.. Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
AmplabJenkins removed a comment on pull request #29667: URL: https://github.com/apache/spark/pull/29667#issuecomment-688596276 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] cloud-fan commented on pull request #29645: [SPARK-32796][SQL] Make withField API support nested struct in array

2020-09-07 Thread GitBox
cloud-fan commented on pull request #29645: URL: https://github.com/apache/spark/pull/29645#issuecomment-688596465 I mean we should prefer "clear and simple semantic", otherwise people can always ask to be more flexible and save more code, like supporting array of array of struct. --

[GitHub] [spark] AmplabJenkins commented on pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
AmplabJenkins commented on pull request #29667: URL: https://github.com/apache/spark/pull/29667#issuecomment-688596276 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] cloud-fan commented on a change in pull request #29605: [SPARK-31511][SQL][2.4] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
cloud-fan commented on a change in pull request #29605: URL: https://github.com/apache/spark/pull/29605#discussion_r484629408 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/HashedRelationSuite.scala ## @@ -341,6 +341,45 @@ class HashedRelationSuite

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
HyukjinKwon commented on a change in pull request #29667: URL: https://github.com/apache/spark/pull/29667#discussion_r484629291 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLExecutionSuite.scala ## @@ -119,6 +125,37 @@ class SQLExecutionSuite extends

[GitHub] [spark] cloud-fan closed pull request #29605: [SPARK-31511][SQL][2.4] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
cloud-fan closed pull request #29605: URL: https://github.com/apache/spark/pull/29605 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [spark] Ted-Jiang commented on pull request #29668: [MINOR][SQL] Fix a typo at 'spark.sql.sources.fileCompressionFactor' error message in SQLConf

2020-09-07 Thread GitBox
Ted-Jiang commented on pull request #29668: URL: https://github.com/apache/spark/pull/29668#issuecomment-688595580 > @Ted-Jiang, can you double check and see if there are other typos in this file while we're here? Sure! --

[GitHub] [spark] cloud-fan commented on pull request #29605: [SPARK-31511][SQL][2.4] Make BytesToBytesMap iterators thread-safe

2020-09-07 Thread GitBox
cloud-fan commented on pull request #29605: URL: https://github.com/apache/spark/pull/29605#issuecomment-688595687 jenkins is happy finally ... Thanks, merging to 2.4! This is an automated message from the Apache Git S

[GitHub] [spark] viirya commented on a change in pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
viirya commented on a change in pull request #29667: URL: https://github.com/apache/spark/pull/29667#discussion_r484629064 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLExecutionSuite.scala ## @@ -119,6 +125,37 @@ class SQLExecutionSuite extends Spark

[GitHub] [spark] HyukjinKwon commented on pull request #29666: [SPARK-32812][PYTHON][TESTS] Avoid initiating a process during the main process for run-tests.py

2020-09-07 Thread GitBox
HyukjinKwon commented on pull request #29666: URL: https://github.com/apache/spark/pull/29666#issuecomment-688595293 Merged to master, branch-3.0 and branch-2.4. This is an automated message from the Apache Git Service. To re

[GitHub] [spark] HyukjinKwon closed pull request #29666: [SPARK-32812][PYTHON][TESTS] Avoid initiating a process during the main process for run-tests.py

2020-09-07 Thread GitBox
HyukjinKwon closed pull request #29666: URL: https://github.com/apache/spark/pull/29666 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] viirya commented on a change in pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
viirya commented on a change in pull request #29667: URL: https://github.com/apache/spark/pull/29667#discussion_r484628548 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLExecutionSuite.scala ## @@ -119,6 +125,37 @@ class SQLExecutionSuite extends Spark

[GitHub] [spark] HyukjinKwon commented on pull request #29668: [MINOR][SQL] Fix a typo at 'spark.sql.sources.fileCompressionFactor' error message in SQLConf

2020-09-07 Thread GitBox
HyukjinKwon commented on pull request #29668: URL: https://github.com/apache/spark/pull/29668#issuecomment-688594956 @Ted-Jiang, can you double check and see if there are other typos in this file while we're here? This is an

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
HyukjinKwon commented on a change in pull request #29667: URL: https://github.com/apache/spark/pull/29667#discussion_r484625824 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLExecutionSuite.scala ## @@ -119,6 +125,37 @@ class SQLExecutionSuite extends

[GitHub] [spark] HyukjinKwon commented on pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
HyukjinKwon commented on pull request #29667: URL: https://github.com/apache/spark/pull/29667#issuecomment-688594453 The change itself looks good. This is an automated message from the Apache Git Service. To respond to the me

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29667: [SPARK-32813][SQL] Get default config of vectorized reader if no active SparkSession

2020-09-07 Thread GitBox
HyukjinKwon commented on a change in pull request #29667: URL: https://github.com/apache/spark/pull/29667#discussion_r484627138 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLExecutionSuite.scala ## @@ -119,6 +125,37 @@ class SQLExecutionSuite extends
