Github user someorz closed the pull request at:
https://github.com/apache/spark/pull/16641
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
GitHub user someorz opened a pull request:
https://github.com/apache/spark/pull/16641
Merge pull request #1 from apache/master
update
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
## How was this patch
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16344
**[Test build #71643 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71643/testReport)**
for PR 16344 at commit
Github user someorz closed the pull request at:
https://github.com/apache/spark/pull/16640
---
Github user someorz commented on the issue:
https://github.com/apache/spark/pull/16640
update
---
GitHub user someorz opened a pull request:
https://github.com/apache/spark/pull/16640
Merge pull request #1 from apache/master
update
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
## How was this patch
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/16593
@windpiger Could you do me a favor and add a dedicated test case in this PR?
- Create a partitioned Hive Table
- Create a partitioned data source Table
- Create a partitioned Hive
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/16344
jenkins test this please
---
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/16344
jenkins add to whitelist
---
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/16593#discussion_r96805975
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala
---
@@ -87,8 +101,8 @@ case class
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/16593#discussion_r96805893
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala
---
@@ -64,7 +77,7 @@ case class
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15415
**[Test build #71642 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71642/testReport)**
for PR 15415 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/12064
**[Test build #71641 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71641/testReport)**
for PR 12064 at commit
Github user hhbyyh commented on a diff in the pull request:
https://github.com/apache/spark/pull/15415#discussion_r96804046
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala ---
@@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
Github user hhbyyh commented on a diff in the pull request:
https://github.com/apache/spark/pull/15415#discussion_r96803812
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala ---
@@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16639
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16639
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71640/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16639
**[Test build #71640 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71640/testReport)**
for PR 16639 at commit
Github user scwf commented on the issue:
https://github.com/apache/spark/pull/16633
We need to define new map output statistics to do this.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16633
@scwf I don't think it would work. Map output statistics are just an approximate number of output bytes; you can't use them to get a correct row count.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16639
**[Test build #71638 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71638/testReport)**
for PR 16639 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16639
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71638/
Test FAILed.
---
Github user hhbyyh commented on a diff in the pull request:
https://github.com/apache/spark/pull/15415#discussion_r96802011
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/fpm/AssociationRules.scala ---
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16639
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16639
**[Test build #71640 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71640/testReport)**
for PR 16639 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16639
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16639
**[Test build #71637 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71637/testReport)**
for PR 16639 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16639
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71637/
Test FAILed.
---
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16639
cc @kayousterhout @markhamstra @mateiz
This isn't just protecting against crazy user code -- I've seen users hit
this with Spark SQL (because of
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/16621#discussion_r96800976
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
---
@@ -586,12 +594,12 @@ class SessionCatalog(
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16552
**[Test build #71639 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71639/testReport)**
for PR 16552 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16639
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16639
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71636/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16639
**[Test build #71638 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71638/testReport)**
for PR 16639 at commit
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/16621#discussion_r96800456
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
---
@@ -586,12 +594,12 @@ class SessionCatalog(
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16581
**[Test build #3541 has
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3541/testReport)**
for PR 16581 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16639
**[Test build #71637 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71637/testReport)**
for PR 16639 at commit
Github user scwf commented on the issue:
https://github.com/apache/spark/pull/16633
Yes, you are right, we cannot ensure a uniform distribution for global
limit.
One idea is to not use a special partitioner; after the shuffle we should get
the map output statistics for the row count of
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16639
**[Test build #71636 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71636/testReport)**
for PR 16639 at commit
GitHub user squito opened a pull request:
https://github.com/apache/spark/pull/16639
[SPARK-19276][CORE] Fetch Failure handling robust to user error handling
## What changes were proposed in this pull request?
Fault-tolerance in Spark requires special handling of shuffle
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16633
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71633/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16633
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16633
**[Test build #71633 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71633/testReport)**
for PR 16633 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16635
**[Test build #71635 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71635/testReport)**
for PR 16635 at commit
Github user jayadevanmurali commented on the issue:
https://github.com/apache/spark/pull/16635
retest this please
---
Github user jayadevanmurali commented on the issue:
https://github.com/apache/spark/pull/16635
@cloud-fan
Incorporated code review comments
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/16605
Okay. But, once this issue is finished, I'm planning to take on SPARK-12823 in a
similar way.
Do you think it's also not worth trying struct? cc: @cloud-fan
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16605
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71631/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16605
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16605
**[Test build #71631 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71631/testReport)**
for PR 16605 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16635
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16635
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71630/
Test PASSed.
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/16605
Well, it would be good if we could support `Array` in `ScalaUDF`, but it's not
a big deal, as users can easily do `udf { (seq: Seq[Int]) => val a =
seq.toArray; // do anything you like with the array
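The workaround cloud-fan describes can be sketched in plain Scala. The object and value names below are illustrative only, not part of any Spark API: the UDF body is written against `Seq` and converts to an `Array` inside.

```scala
// Illustrative sketch: accept a Seq[Int] in the UDF signature and call
// .toArray inside the body wherever array semantics are needed.
object SeqToArrayUdf {
  val body: Seq[Int] => Int = { seq =>
    val a = seq.toArray // do anything you like with the array
    a.sum
  }
}
```

In Spark this body would then be wrapped with `org.apache.spark.sql.functions.udf(body)` and applied to an array column.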
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16635
**[Test build #71630 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71630/testReport)**
for PR 16635 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16593
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16593
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71634/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16593
**[Test build #71634 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71634/testReport)**
for PR 16593 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16593
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71632/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16593
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16593
**[Test build #71632 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71632/testReport)**
for PR 16593 at commit
Github user gatorsmile closed the pull request at:
https://github.com/apache/spark/pull/16634
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/16634
Thanks for the review, Merging to 2.0!
---
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/16605
Sure, @maropu. `WrappedArray` is not documented for now.
Hi, @gatorsmile and @cloud-fan .
Could you review this PR?
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16633
@scwf No. A simple example: suppose there are 5 local limits which produce 1, 2,
1, 1, 1 rows when the limit is 10. If you shuffle to 5 partitions, the
distributions for each local limit look like:
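viirya's skew scenario can be simulated outside Spark. The helper below is hypothetical (not Spark code): it assigns the rows of each local limit to partitions round-robin, the way a naive per-task partitioner would, and reports the resulting per-partition row counts.

```scala
// Simulate shuffling the output of several local limits to N partitions.
// Each map task assigns its rows partition ids 0, 1, 2, ... round-robin.
object LimitSkewSketch {
  def partitionCounts(localCounts: Seq[Int], numPartitions: Int): Seq[Int] = {
    val counts = Array.fill(numPartitions)(0)
    for (rows <- localCounts; i <- 0 until rows)
      counts(i % numPartitions) += 1
    counts.toSeq
  }
}
```

With local counts 1, 2, 1, 1, 1 and 5 partitions, partition 0 receives 5 rows while partitions 2-4 receive none, so the distribution is far from uniform even after the shuffle.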
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/16213
@tdas ping
---
Github user jayadevanmurali commented on a diff in the pull request:
https://github.com/apache/spark/pull/16635#discussion_r96790387
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
---
@@ -2513,4 +2513,18 @@ class SQLQuerySuite extends QueryTest with
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/16605
Oh, yeah. I hadn't found that, and I think it's a good point.
IMO `WrappedArray` is used internally for implicit conversions, so
users do not use `WrappedArray` directly for UDFs in most
Github user jayadevanmurali commented on a diff in the pull request:
https://github.com/apache/spark/pull/16635#discussion_r96790264
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
---
@@ -2513,4 +2513,18 @@ class SQLQuerySuite extends QueryTest with
Github user jayadevanmurali commented on a diff in the pull request:
https://github.com/apache/spark/pull/16635#discussion_r96790238
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
---
@@ -2513,4 +2513,18 @@ class SQLQuerySuite extends QueryTest with
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16635#discussion_r96790019
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
---
@@ -2513,4 +2513,18 @@ class SQLQuerySuite extends QueryTest with
Github user scwf commented on the issue:
https://github.com/apache/spark/pull/16633
Refer to the mailing list:
> One issue left is how to decide the shuffle partition number.
We can have a config for the maximum number of elements each GlobalLimit
task should process,
then do a
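The config-driven idea scwf sketches could look like the following. The names are hypothetical (there is no such Spark config in the source): derive the shuffle partition count for the global limit from the limit value and a configured maximum number of rows per task.

```scala
// Hypothetical helper: pick a shuffle partition count so that no GlobalLimit
// task has to process more than maxRowsPerTask rows.
object GlobalLimitPartitions {
  def numPartitions(limit: Long, maxRowsPerTask: Long): Int = {
    require(maxRowsPerTask > 0, "maxRowsPerTask must be positive")
    math.max(1, math.ceil(limit.toDouble / maxRowsPerTask).toInt)
  }
}
```

For example, a limit of 1,000,000 with 100,000 rows allowed per task would yield 10 shuffle partitions, while a small limit collapses to a single partition.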
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16605#discussion_r96789868
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
---
@@ -84,7 +86,9 @@ case class ScalaUDF(
case 1 =>
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16635#discussion_r96789821
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
---
@@ -2513,4 +2513,18 @@ class SQLQuerySuite extends QueryTest with
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16635#discussion_r96789777
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
---
@@ -2513,4 +2513,18 @@ class SQLQuerySuite extends QueryTest with
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16633
**[Test build #71633 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71633/testReport)**
for PR 16633 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16593
**[Test build #71634 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71634/testReport)**
for PR 16593 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16633
@scwf
> it uses a special partitioner to do this; the partitioner, like
row_number in SQL, gives each row a uniform partition id, so in the reduce task
each task handles a number of rows very
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16593#discussion_r96788653
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala
---
@@ -87,8 +101,8 @@ case class
Github user scwf commented on the issue:
https://github.com/apache/spark/pull/16633
To be clear, we now have these issues:
1. Local limit computes all partitions, which means it launches many tasks,
but a few small tasks may actually be enough.
2. Global limit uses a single partition
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16593
**[Test build #71632 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71632/testReport)**
for PR 16593 at commit
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16593#discussion_r96788538
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala
---
@@ -1361,6 +1355,22 @@ class HiveDDLSuite
}
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16633
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71627/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16633
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16633
**[Test build #71627 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71627/testReport)**
for PR 16633 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16605
**[Test build #71631 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71631/testReport)**
for PR 16605 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16635
**[Test build #71630 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71630/testReport)**
for PR 16635 at commit
Github user jayadevanmurali commented on the issue:
https://github.com/apache/spark/pull/16635
retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16635
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71629/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16635
**[Test build #71629 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71629/testReport)**
for PR 16635 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16635
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16635
**[Test build #71629 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71629/testReport)**
for PR 16635 at commit
Github user jayadevanmurali commented on the issue:
https://github.com/apache/spark/pull/16635
retest this please
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/16552
The overall idea is to use `InsertIntoTable` to implement appending to a Hive
table, but this approach is too hacky; we should follow the way we deal
with data source tables, e.g.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/16633#discussion_r96786080
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala ---
@@ -90,21 +94,74 @@ trait BaseLimitExec extends UnaryExecNode with
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/16621#discussion_r96785980
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
---
@@ -118,6 +118,14 @@ class SessionCatalog(
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16633
@scwf The main issue the user posted on the mailing list is that the limit or
the partition number is big enough to cause a performance bottleneck when
shuffling the data of the local limit. But
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16638
Can one of the admins verify this patch?
---
GitHub user ouyangxiaochen opened a pull request:
https://github.com/apache/spark/pull/16638
spark-19115
## What changes were proposed in this pull request?
Spark SQL supports the command: `create external table if not exists gen_tbl
like src_tbl location '/warehouse/gen_tbl'` in
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16621#discussion_r96785426
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
---
@@ -118,6 +118,14 @@ class SessionCatalog(
}
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/16627#discussion_r96779672
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala
---
@@ -34,6 +35,132 @@ import