Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@kayousterhout @squito @mridulm
Thanks for reviewing this!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project doe
Github user kayousterhout commented on the issue:
https://github.com/apache/spark/pull/16867
Merged this to master -- thanks for all of the quick updates here
@jinxing64!
---
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@kayousterhout
Thanks a lot for the comments. I refined accordingly :)
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75137/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #75137 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75137/testReport)**
for PR 16867 at commit
[`b9bdf44`](https://github.com/apache/spark/commit/b
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #75137 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75137/testReport)**
for PR 16867 at commit
[`b9bdf44`](https://github.com/apache/spark/commit/b9
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@kayousterhout more comments?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74870/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74870 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74870/testReport)**
for PR 16867 at commit
[`5192a32`](https://github.com/apache/spark/commit/5
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74870 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74870/testReport)**
for PR 16867 at commit
[`5192a32`](https://github.com/apache/spark/commit/51
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74858/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74858 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74858/testReport)**
for PR 16867 at commit
[`36c205a`](https://github.com/apache/spark/commit/36
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@squito
Thanks :) already refined.
---
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@mridulm
Thanks a lot for helping review this :) I really appreciate it.
---
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16867
LGTM @kayousterhout, @squito.
---
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@squito
Sure. I ran the test with 100k tasks. The results are below:
| | time cost |
| --| -- |
| insert | 135ms, 122ms, 119ms, 120ms, 163ms |
| `checkSpeculatableTa
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16867
@jinxing64 would you mind repeating your performance experiments with the
latest version? Both for `checkSpeculatableTasks` and for inserting the
duration on each task completion?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74675/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74675 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74675/testReport)**
for PR 16867 at commit
[`617d5aa`](https://github.com/apache/spark/commit/6
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74674/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74674 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74674/testReport)**
for PR 16867 at commit
[`c13a198`](https://github.com/apache/spark/commit/c
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74673/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74673 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74673/testReport)**
for PR 16867 at commit
[`7740d77`](https://github.com/apache/spark/commit/7
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74675 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74675/testReport)**
for PR 16867 at commit
[`617d5aa`](https://github.com/apache/spark/commit/61
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74674 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74674/testReport)**
for PR 16867 at commit
[`c13a198`](https://github.com/apache/spark/commit/c1
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74673 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74673/testReport)**
for PR 16867 at commit
[`7740d77`](https://github.com/apache/spark/commit/77
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74649/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74649 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74649/testReport)**
for PR 16867 at commit
[`2518a95`](https://github.com/apache/spark/commit/2
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74649 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74649/testReport)**
for PR 16867 at commit
[`2518a95`](https://github.com/apache/spark/commit/25
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74640/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74640 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74640/testReport)**
for PR 16867 at commit
[`104e867`](https://github.com/apache/spark/commit/1
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@kayousterhout
Thanks a lot for the comments :) very helpful.
I've refined, please take another look when you have time.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74640 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74640/testReport)**
for PR 16867 at commit
[`104e867`](https://github.com/apache/spark/commit/10
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@kayousterhout @mridulm
Any more comments on this? :)
---
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@squito
Thanks a lot for the comments. I've refined :)
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74476/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74476 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74476/testReport)**
for PR 16867 at commit
[`5aa2fcf`](https://github.com/apache/spark/commit/5
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74475/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74475 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74475/testReport)**
for PR 16867 at commit
[`318a172`](https://github.com/apache/spark/commit/3
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74476 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74476/testReport)**
for PR 16867 at commit
[`5aa2fcf`](https://github.com/apache/spark/commit/5a
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74475 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74475/testReport)**
for PR 16867 at commit
[`318a172`](https://github.com/apache/spark/commit/31
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74359/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74359 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74359/testReport)**
for PR 16867 at commit
[`09719a2`](https://github.com/apache/spark/commit/0
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74359 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74359/testReport)**
for PR 16867 at commit
[`09719a2`](https://github.com/apache/spark/commit/09
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@squito Sorry, it seems like something went wrong when I merged and tried to
resolve the conflict. I squashed the commits and rebased. It seems OK now.
---
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16867
@jinxing64 it looks like something went wrong with your last push; I think
there are lots of unintentional changes
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74322/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74322 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74322/testReport)**
for PR 16867 at commit
[`205f52f`](https://github.com/apache/spark/commit/2
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74322 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74322/testReport)**
for PR 16867 at commit
[`205f52f`](https://github.com/apache/spark/commit/20
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@kayousterhout @squito @mridulm
I refined according to the comments. Please take a look when you have time :)
---
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
Thanks a lot for the comments. I refined accordingly :)
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74070/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74070 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74070/testReport)**
for PR 16867 at commit
[`ee4c486`](https://github.com/apache/spark/commit/e
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74069/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74069 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74069/testReport)**
for PR 16867 at commit
[`a2381b6`](https://github.com/apache/spark/commit/a
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74070 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74070/testReport)**
for PR 16867 at commit
[`ee4c486`](https://github.com/apache/spark/commit/ee
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #74069 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74069/testReport)**
for PR 16867 at commit
[`a2381b6`](https://github.com/apache/spark/commit/a2
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16867
You are right, the strict definition requires us to average. It just makes
it difficult at times to reason based on logs, when the duration mentioned
does not exist when there are discontinuit
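To make the point about averaging concrete, here is a small worked example in Python. The durations are made-up values, not measurements from this PR. With an even number of samples, the strict median is the mean of the two middle values, so the reported number may not match any duration that actually occurred.

```python
import statistics

# Hypothetical task durations in milliseconds (illustrative only).
durations_ms = [100, 120, 300, 900]

# Strict median: average of the two middle values (120 and 300).
strict_median = statistics.median(durations_ms)

# Lower-middle value: always one of the observed durations.
low_median = statistics.median_low(durations_ms)

print(strict_median)  # 210.0, which no task actually took
print(low_median)     # 120, which appears in the data
```

Which of the two to log is a trade-off between matching the textbook definition and being able to find the value in the task logs.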
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73985/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #73985 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73985/testReport)**
for PR 16867 at commit
[`82aa3ed`](https://github.com/apache/spark/commit/8
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@mridulm
Thanks a lot for the comments. I refined accordingly. (By the way, the time
complexity of `rebalance` in `MedianHeap` is O(1).)
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #73985 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73985/testReport)**
for PR 16867 at commit
[`82aa3ed`](https://github.com/apache/spark/commit/82
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73968/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #73968 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73968/testReport)**
for PR 16867 at commit
[`f30ef46`](https://github.com/apache/spark/commit/f
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73967/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #73967 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73967/testReport)**
for PR 16867 at commit
[`712e06e`](https://github.com/apache/spark/commit/7
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16867
The cost of a median heap could be higher than that of a TreeMap, IMO. For
example, what about the additional dequeue + enqueue when a rebalance is required?
If the cost is high enough, we might want to relook at the P
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@mridulm
Thanks a lot for your comments. I previously ran a test with `TreeSet` and
100k tasks, measuring the time spent on insertion. The results are: 372ms,
362ms, 458ms, 429ms, 363ms,
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73955/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #73955 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73955/testReport)**
for PR 16867 at commit
[`1fac678`](https://github.com/apache/spark/commit/1
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #73968 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73968/testReport)**
for PR 16867 at commit
[`f30ef46`](https://github.com/apache/spark/commit/f3
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #73967 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73967/testReport)**
for PR 16867 at commit
[`712e06e`](https://github.com/apache/spark/commit/71
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #73955 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73955/testReport)**
for PR 16867 at commit
[`1fac678`](https://github.com/apache/spark/commit/1f
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73932/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16867
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #73932 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73932/testReport)**
for PR 16867 at commit
[`f5fb0b9`](https://github.com/apache/spark/commit/f
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16867
**[Test build #73932 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73932/testReport)**
for PR 16867 at commit
[`f5fb0b9`](https://github.com/apache/spark/commit/f5
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@squito
Yes, some of the machine learning jobs in my cluster that compute a Cartesian
product have more than 100k tasks in the `TaskSetManager`.
---
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16867
@kayousterhout @squito
Thanks a lot for your comments, really helpful :)
I really think a median heap is a good idea. `slice` is `O(n)` and is not
the most efficient.
I'm doing implementati
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16867
More brainstorming:
(1) you could lazily update your median collection (whether it's a treeset
or median heap). First you'd just dump tasks into an array, and then when you
query for the med
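A rough Python sketch of the lazy-update idea above: inserts only append to an array, and the ordering work is deferred until the median is queried. The class name `LazyMedian` and its methods are hypothetical, and sorting on query is an assumption made for brevity. The original suggestion is cut off here, and a real version might instead flush the buffered values into a treeset or median heap.

```python
class LazyMedian:
    """Buffers values on insert; sorts only when the median is requested.

    Hypothetical illustration, not code from the PR.
    """

    def __init__(self):
        self._values = []
        self._dirty = False  # True when values were appended since the last sort

    def insert(self, value):
        # O(1) amortized: no ordering work on the hot path.
        self._values.append(value)
        self._dirty = True

    def median(self):
        if not self._values:
            raise ValueError("median of an empty collection")
        if self._dirty:
            # Deferred work: pay for ordering only when somebody asks.
            self._values.sort()
            self._dirty = False
        n = len(self._values)
        mid = n // 2
        if n % 2 == 1:
            return float(self._values[mid])
        return (self._values[mid - 1] + self._values[mid]) / 2


lazy = LazyMedian()
for duration in [4, 1, 3]:
    lazy.insert(duration)
first = lazy.median()   # sorted values: [1, 3, 4]
lazy.insert(2)
second = lazy.median()  # sorted values: [1, 2, 3, 4]
```

This shifts cost from every task completion to each median query, so it only pays off if queries are much rarer than inserts.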
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16867
A median heap is a good idea to try. In fact, `slice` is `O(n)` because of
the way it's implemented: it actually iterates through the first `n/2` elements
(even though it should be able to do something
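For readers unfamiliar with the median heap mentioned above, here is a minimal Python sketch of the usual two-heap technique. It is not the `MedianHeap` from this PR (Spark is written in Scala); the class and method names are hypothetical. The smaller half of the values lives in a max-heap and the larger half in a min-heap, so the median can be read from the heap tops without scanning `n/2` elements.

```python
import heapq


class TwoHeapMedian:
    """Running median using two heaps (illustrative, not Spark's code)."""

    def __init__(self):
        self._lower = []  # max-heap of the smaller half, stored as negated values
        self._upper = []  # min-heap of the larger half

    def __len__(self):
        return len(self._lower) + len(self._upper)

    def insert(self, value):
        # Route the value to the half it belongs to, then fix the sizes.
        if not self._lower or value <= -self._lower[0]:
            heapq.heappush(self._lower, -value)
        else:
            heapq.heappush(self._upper, value)
        self._rebalance()

    def _rebalance(self):
        # Keep the heap sizes within one of each other by moving one element.
        if len(self._lower) > len(self._upper) + 1:
            heapq.heappush(self._upper, -heapq.heappop(self._lower))
        elif len(self._upper) > len(self._lower) + 1:
            heapq.heappush(self._lower, -heapq.heappop(self._upper))

    def median(self):
        if len(self) == 0:
            raise ValueError("median of an empty collection")
        if len(self._lower) == len(self._upper):
            # Even count: average the two middle values.
            return (-self._lower[0] + self._upper[0]) / 2
        if len(self._lower) > len(self._upper):
            return float(-self._lower[0])
        return float(self._upper[0])


heap = TwoHeapMedian()
for duration in [5, 1, 3]:
    heap.insert(duration)
odd_median = heap.median()
heap.insert(10)
even_median = heap.median()
```

With binary heaps as used in this sketch, each insert costs `O(log n)` and reading the median is `O(1)`.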
Github user kayousterhout commented on the issue:
https://github.com/apache/spark/pull/16867
Also, thanks for doing the timing measurements!
---
Github user kayousterhout commented on the issue:
https://github.com/apache/spark/pull/16867
I'm a little on the fence about this because of the added complexity, but
it does seem to be a significant time improvement. Did you consider
implementing this as a median heap (see the last