On 08.03.2016 08:12, srinivas devaki wrote:
Hi,
sorry i didn't get to your last mail as i'm having exams at that time.

No problem. :) I hope they went all well for you.

Everything is good so far.

but if at all the purpose is to add features to the stdlib heap and
keeping the speed offered by the stdlib, why aren't you allowing the
duplicate elements which is possible with stdlib heap.

Don't let the old boy have time to rest, he? ;-)

But that's good. We all need a kick in the ass from time to time in order to get thing running. :-)

to be able to add duplicate elements you just have to make the set as
a dict with counter as the value. afterall it is highly common to have
priority task scheduling with equal priorities to the tasks.

as far as the benchmarks are concerned it didn't change much with the
current dataset. but it is not valid to compare them as the current
dataset is intended for unique elements. but with the dict the dataset
can contain elements which have equal keys.

So, my initial design idea was to wrap these items up in their own wrapper and thus avoiding the duplicate issue. I discussed this with Cem here http://code.activestate.com/lists/python-list/697567/ :)

on my machine with 10 repetitions in the test_xheap_time.py
** results for the code which supports duplicate elements
h...@github.com/eightnoteight/xheap **

(u'init', u'heapq      ', u' 0.45 ( 1.00x)  4.59 ( 1.00x) 52.24 (
1.00x) 553.80 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.48 ( 1.07x)  5.41 ( 1.18x) 75.89 (
1.45x) 786.46 ( 1.42x)')
(u'    ', u'RemovalHeap', u' 0.86 ( 1.93x)  9.98 ( 2.18x) 131.28 (
2.51x) 1364.14 ( 2.46x)')
--------------------------------------------------------------------
(u'pop ', u'heapq      ', u' 0.09 ( 1.00x)  1.27 ( 1.00x) 15.64 (
1.00x) 184.76 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.10 ( 1.07x)  1.33 ( 1.05x) 16.30 (
1.04x) 191.43 ( 1.04x)')
(u'    ', u'RemovalHeap', u' 0.15 ( 1.64x)  1.87 ( 1.47x) 21.33 (
1.36x) 242.31 ( 1.31x)')
--------------------------------------------------------------------
(u'push', u'heapq      ', u' 0.04 ( 1.00x)  0.33 ( 1.00x)  3.53 (
1.00x) 32.99 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.05 ( 1.20x)  0.41 ( 1.25x)  4.37 (
1.24x) 44.14 ( 1.34x)')
(u'    ', u'RemovalHeap', u' 0.06 ( 1.58x)  0.56 ( 1.70x)  5.85 (
1.66x) 59.59 ( 1.81x)')
--------------------------------------------------------------------
--------------------------------------------------------------------
(u'init', u'heapq    ', u' 0.45 ( 1.00x)  8.00 ( 1.00x) 109.50 (
1.00x) 898.50 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.50 ( 1.10x)  8.75 ( 1.09x) 112.63 (
1.03x) 1108.26 ( 1.23x)')
(u'    ', u'XHeap    ', u' 0.93 ( 2.06x) 13.88 ( 1.74x) 170.97 (
1.56x) 1709.94 ( 1.90x)')
--------------------------------------------------------------------
(u'pop ', u'heapq    ', u' 0.04 ( 1.00x)  0.54 ( 1.00x)  6.50 ( 1.00x)
76.64 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.06 ( 1.73x)  0.80 ( 1.49x)  9.16 ( 1.41x)
101.68 ( 1.33x)')
(u'    ', u'XHeap    ', u' 0.10 ( 2.65x)  1.14 ( 2.13x) 12.60 ( 1.94x)
137.73 ( 1.80x)')
--------------------------------------------------------------------
(u'push', u'heapq    ', u' 0.01 ( 1.00x)  0.17 ( 1.00x)  1.86 ( 1.00x)
14.85 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.04 ( 3.88x)  0.47 ( 2.86x)  4.80 ( 2.58x)
42.98 ( 2.89x)')
(u'    ', u'XHeap    ', u' 0.04 ( 3.79x)  0.47 ( 2.85x)  4.85 ( 2.61x)
43.94 ( 2.96x)')
--------------------------------------------------------------------
--------------------------------------------------------------------
(u'remove', u'RemovalHeap', u' 0.07 ( 1.00x)  0.73 ( 1.00x)  7.38 (
1.00x) 75.18 ( 1.00x)')
(u'      ', u'XHeap      ', u' 0.05 ( 0.79x)  0.55 ( 0.75x)  5.52 (
0.75x) 55.08 ( 0.73x)')
--------------------------------------------------------------------
--------------------------------------------------------------------


** benchmark for current git HEAD d0707fba2401a3cef8aed54028fe6d7e9497faa5 **


(u'init', u'heapq      ', u' 0.49 ( 1.00x)  5.03 ( 1.00x) 58.71 (
1.00x) 600.91 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.51 ( 1.05x)  5.83 ( 1.16x) 82.88 (
1.41x) 886.22 ( 1.47x)')
(u'    ', u'RemovalHeap', u' 0.58 ( 1.19x)  7.19 ( 1.43x) 106.80 (
1.82x) 1108.52 ( 1.84x)')
--------------------------------------------------------------------
(u'pop ', u'heapq      ', u' 0.10 ( 1.00x)  1.41 ( 1.00x) 17.64 (
1.00x) 209.82 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.11 ( 1.07x)  1.47 ( 1.04x) 18.27 (
1.04x) 215.14 ( 1.03x)')
(u'    ', u'RemovalHeap', u' 0.15 ( 1.52x)  1.91 ( 1.35x) 22.64 (
1.28x) 258.68 ( 1.23x)')
--------------------------------------------------------------------
(u'push', u'heapq      ', u' 0.04 ( 1.00x)  0.32 ( 1.00x)  3.49 (
1.00x) 33.92 ( 1.00x)')
(u'    ', u'Heap       ', u' 0.05 ( 1.18x)  0.39 ( 1.22x)  4.21 (
1.20x) 42.03 ( 1.24x)')
(u'    ', u'RemovalHeap', u' 0.06 ( 1.52x)  0.52 ( 1.62x)  5.60 (
1.60x) 56.54 ( 1.67x)')
--------------------------------------------------------------------
--------------------------------------------------------------------
(u'init', u'heapq    ', u' 0.44 ( 1.00x)  7.92 ( 1.00x) 106.52 (
1.00x) 915.20 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.50 ( 1.14x)  8.67 ( 1.10x) 111.99 (
1.05x) 1129.89 ( 1.23x)')
(u'    ', u'XHeap    ', u' 0.61 ( 1.38x) 10.75 ( 1.36x) 140.86 (
1.32x) 1417.84 ( 1.55x)')
--------------------------------------------------------------------
(u'pop ', u'heapq    ', u' 0.04 ( 1.00x)  0.55 ( 1.00x)  6.59 ( 1.00x)
76.81 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.06 ( 1.68x)  0.79 ( 1.43x)  9.04 ( 1.37x)
101.72 ( 1.32x)')
(u'    ', u'XHeap    ', u' 0.09 ( 2.43x)  1.03 ( 1.85x) 11.48 ( 1.74x)
125.94 ( 1.64x)')
--------------------------------------------------------------------
(u'push', u'heapq    ', u' 0.01 ( 1.00x)  0.16 ( 1.00x)  1.85 ( 1.00x)
14.65 ( 1.00x)')
(u'    ', u'OrderHeap', u' 0.04 ( 4.32x)  0.46 ( 2.81x)  4.74 ( 2.56x)
42.25 ( 2.88x)')
(u'    ', u'XHeap    ', u' 0.03 ( 3.73x)  0.42 ( 2.55x)  4.34 ( 2.35x)
38.37 ( 2.62x)')
--------------------------------------------------------------------
--------------------------------------------------------------------
(u'remove', u'RemovalHeap', u' 0.05 ( 1.00x)  0.54 ( 1.00x)  5.62 (
1.00x) 57.04 ( 1.00x)')
(u'      ', u'XHeap      ', u' 0.04 ( 0.88x)  0.44 ( 0.80x)  4.41 (
0.78x) 43.86 ( 0.77x)')
--------------------------------------------------------------------
--------------------------------------------------------------------


So as the results are not much effected apart of __init__, i think you
should consider this.

Looks promising. I will look into it.

Note: i'm not using collections.Counter because it is written in
python, and from my previous experience it is slower than using
defaultdict for this kind of purposes.

That's exactly why the __setitem__ implemenation was so slow. It did not use the C implemention. I think I could reduce the overhead even further by having my own global/thread-local integer counter. Stay tuned. ;-)

ps: there are two error's when i ran tests with test_xheap.

Damn. I see this is Python 2 and Python 3 related. Thanks for bringing this to my attention. I am going to fix this soon.

OMG why
did you keep repititions = 10000, at first i though my pentium laptop
is too slow that it is not printing a single line even after 20
minutes. then i saw the number of computations are in the order of
10**10, how many days did it took to completely run the tests?

I don't know. I let it run overnight since I wasn't at home. :)

But you are right. I re-executed the benchmark and compared 100, 1000 and 10000 with each other. Almost no difference at all.

I am going to reduce it to 100. So, it takes ca. 8 minutes on my machine.

Thanks for your feedback,
Sven



On Sun, Mar 6, 2016 at 7:29 PM, Sven R. Kunze <srku...@mail.de> wrote:
Hi python-list, hi Srinivas,

I managed to implement the mark&sweep approach for fast removal from heaps.
This way, I got three pleasant results:

1) a substantial speed up!
2) an improved testsuite
3) discovery and fixing of several bugs

@Srinivas I would be honored if you could have a look at the implementation:
https://github.com/srkunze/xheap . After all, it was your idea. I only
perform the sweeping step during pop and remove with the condition of yours.
:)

Using the original xheap benchmark, I could see huge speedups: from 50x/25x
down to 3x/2x compared to heapq. That's a massive improvement. I will
publish an update soon.

Best,
Sven



--
https://mail.python.org/mailman/listinfo/python-list

Reply via email to