Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/16677
Yes, sure. I will create a ticket for this issue and keep you guys in
the loop. Thanks.
---
-
To unsubscribe, e-mail:
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
@sujith71955 Thanks, I see. That case is somewhat different from the problem
this PR aims to solve, but I think it is a reasonable use case. Would you like
to create a ticket so we can track it?
---
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/16677
Mainly, I think we are trying to interpolate the number of partitions
---
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/16677
@viirya I have a use case where a normal query takes around 5
seconds, whereas the same query with limit 5000 takes around 17 seconds. When I
was checking, I found a bottleneck in the
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
@sujith71955 To optimize `executeTake`, we need to collect
statistics about the RDD. `executeTake` incrementally scans partitions. Ideally, it
should scan just a few partitions to return `n` rows, and
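
As a rough illustration of the incremental scan described above, here is a minimal sketch in Python. It is not Spark's Scala implementation: the function name `incremental_take`, the list-of-lists stand-in for partitions, and the `scale_up_factor` growth parameter are all assumptions made for this example.

```python
def incremental_take(partitions, n, scale_up_factor=4):
    """Return up to n rows, scanning partitions in growing batches.

    partitions: list of lists, a hypothetical stand-in for RDD partitions.
    Returns (rows, number_of_partitions_scanned).
    """
    rows = []
    scanned = 0   # how many partitions have been scanned so far
    batch = 1     # start by scanning a single partition
    while len(rows) < n and scanned < len(partitions):
        end = min(scanned + batch, len(partitions))
        # Every partition in the batch counts as scanned, even if the
        # limit is already satisfied part-way through the batch.
        for part in partitions[scanned:end]:
            rows.extend(part[: n - len(rows)])
        scanned = end
        batch *= scale_up_factor  # widen the next batch
    return rows, scanned
```

With six two-row partitions and `n = 3`, this sketch returns three rows but reports five partitions scanned, because the second batch is launched before anything is known about row counts. That is the kind of over-scanning that per-partition statistics could help avoid.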
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/16677
@viirya Are we also looking to optimize the CollectLimitExec part? I saw that
SparkPlan has an executeTake() method, which basically interpolates the
number of partitions and processes the limit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
I understand the two major concerns regarding this change. I'm going to
submit a PR to revert the change. I will look into this idea further with a new
design.
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/16677
I'm convinced. There are 2 major issues:
1. Abusing shuffle. We need a new mechanism for the driver to analyze some
statistics about the data (records per map task).
2. Too many small tasks. We
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/16677
ok after thinking about it more, i think we should just revert all of these
changes and go back to the drawing board. here's why:
1. the prs change some of the most common/core parts of spark,
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/16677
Let me take an example from the PR description
> For example, we have three partitions with rows (100, 100, 50)
respectively. In global limit of 100 rows, we may take (34, 33, 33) rows for
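
The quoted example can be reproduced with a small sketch. This is an illustrative Python version under stated assumptions, not the PR's code: the function name `split_limit` and the even-share-with-redistribution strategy are hypothetical, chosen only because they yield the (34, 33, 33) split from the description.

```python
def split_limit(partition_sizes, limit):
    """Spread a global limit as evenly as possible across partitions.

    partition_sizes: row count of each partition (assumed known up front).
    Returns the number of rows to take from each partition.
    """
    takes = [0] * len(partition_sizes)
    remaining = min(limit, sum(partition_sizes))
    while remaining > 0:
        # Partitions that still have rows left to contribute.
        open_parts = [i for i, size in enumerate(partition_sizes)
                      if takes[i] < size]
        share, extra = divmod(remaining, len(open_parts))
        for rank, i in enumerate(open_parts):
            # The first `extra` partitions take one additional row.
            want = share + (1 if rank < extra else 0)
            got = min(want, partition_sizes[i] - takes[i])
            takes[i] += got
            remaining -= got
    return takes
```

For partitions of (100, 100, 50) rows and a limit of 100, `split_limit([100, 100, 50], 100)` gives `[34, 33, 33]`. When a partition is too small to cover its share, the shortfall is redistributed to the others in the next pass.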
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
I'm not sure where it can cause perf regressions. Basically, this just
changes the way we retrieve records from partitions when performing a limit. It
doesn't shuffle them together into a single
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/16677
actually looking at the design - this could cause perf regressions in some
cases too right? it introduces a barrier that was previously non-existent. if
the number of records to take isn't
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
@rxin Thanks for the comment. I will improve the document in a pr.
---
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/16677
1. `numOutputs` is the number of records.
2. 8 bytes per `MapStatus`.
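
To put the 8-byte figure in perspective, here is a back-of-the-envelope sketch in Python. It assumes one `MapStatus` per map task held on the driver; the helper names and the task count used below are hypothetical and only illustrate the arithmetic.

```python
BYTES_PER_MAP_STATUS = 8  # the per-MapStatus figure quoted above


def extra_driver_bytes(num_map_tasks, bytes_per_status=BYTES_PER_MAP_STATUS):
    """Extra driver memory, assuming one MapStatus per map task."""
    return num_map_tasks * bytes_per_status


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024 * 1024)
```

Under these assumptions, a hypothetical job with 1,000,000 map tasks would add 8,000,000 bytes, or roughly 7.6 MiB, on the driver.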
---
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/16677
two questions about this (i just saw this from a different place):
1. is numOutput about the number of records?
2. how much will memory usage increase on the driver at scale?
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
Thank you! @hvanhovell
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/16677
Merging to master. Thanks!
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94452/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94452 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94452/testReport)**
for PR 16677 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
I set up a test PR for `VersionsSuite` at #22046.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94452 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94452/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
I don't run into this test failure in `VersionsSuite` locally.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94430/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94430 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94430/testReport)**
for PR 16677 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
Looks like an unrelated test failure in `VersionsSuite`...
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94430 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94430/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94415 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94415/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94415/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94415 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94415/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94227/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94227 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94227/testReport)**
for PR 16677 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94227 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94227/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94220/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94220 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94220/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94220 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94220/testReport)**
for PR 16677 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94216/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94216 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94216/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94216 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94216/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94195/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94195 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94195/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #94195 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94195/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
@hvanhovell Shall we consider including this in 2.4?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93789/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #93789 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93789/testReport)**
for PR 16677 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #93789 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93789/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #93771 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93771/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93771/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #93771 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93771/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93502/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #93502 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93502/testReport)**
for PR 16677 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #93502 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93502/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93486/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #93486 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93486/testReport)**
for PR 16677 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #93486 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93486/testReport)**
for PR 16677 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user dilipbiswal commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16677
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93478/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #93478 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93478/testReport)**
for PR 16677 at commit