[issue23290] Faster set copying

2015-05-13 Thread Raymond Hettinger

Changes by Raymond Hettinger :


--
resolution: later -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-13 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 79c884cc9625 by Raymond Hettinger in branch 'default':
Issue #23290:  Optimize set_merge() for cases where the target is empty.
https://hg.python.org/cpython/rev/79c884cc9625

--
nosy: +python-dev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-13 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Except one error (the declaration after the code), the patch LGTM. It is shame 
to me that I left so much non-optimized code in specialized loops.

The result of my microbenchmarks is the speed up 51%, 153%, 23% and 96% 
respectively.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-12 Thread Raymond Hettinger

Changes by Raymond Hettinger :


Added file: http://bugs.python.org/file39353/set_faster_copy_6.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-12 Thread Raymond Hettinger

Changes by Raymond Hettinger :


Removed file: http://bugs.python.org/file39352/set_faster_copy_6.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-12 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Attaching a variant with several fix-ups (mostly minor):

* Changed the order of the three sections to go from 
most-restricted-most-optimized to the general-fall-through case.  The downside 
is that we test so->fill==0 twice.  The upside is that it corresponds to my way 
of thinking about the logic.

* Put the fill/used increments in the same order as the rest of the file.

* Loop over other_entry++ instead of using indexing.  This corresponds to my 
way of thinking about the entries and gives the compiler a stronger hint that 
it can avoid the indexing overhead.

* Removed the unnecessary dummy check from the direct_pointer_copy case.

* Switch the order of the same size and no dummies tests in the insert_clean 
case.

* Revise the comments to be clearer about the requirements for each case.

* Move the sometimes unnecessary hash variable assignments inside the 
conditional.

--
Added file: http://bugs.python.org/file39352/set_faster_copy_6.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-12 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is the patch with hoisted the conditionals out of the loop.

New microbenchmarking results:

$ ./python -m timeit -s "s = set(range(10**4))" -- "frozenset(s)"
Unpatched: 1000 loops, best of 3: 407 usec per loop
With patch #4: 1000 loops, best of 3: 325 usec per loop  (speed up 25%)
With patch #5: 1000 loops, best of 3: 272 usec per loop  (speed up 50%)

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**4//2) for j in 
range(2)}" -- "frozenset(s)"
Unpatched: 1000 loops, best of 3: 995 usec per loop
With patch #4: 1000 loops, best of 3: 447 usec per loop  (speed up 123%)
With patch #5: 1000 loops, best of 3: 417 usec per loop  (speed up 139%)

$ ./python -m timeit -s "s = set(range(10**4)); s.add(-1); s.discard(-1)" -- 
"frozenset(s)"
Unpatched: 1000 loops, best of 3: 411 usec per loop
With patch #4: 1000 loops, best of 3: 355 usec per loop  (speed up 16%)
With patch #5: 1000 loops, best of 3: 406 usec per loop  (equal)

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**4//2) for j in 
range(2)}; s.add(-1); s.discard(-1)" -- "frozenset(s)"
Unpatched: 1000 loops, best of 3: 1.01 msec per loop
With patch #4: 1000 loops, best of 3: 572 usec per loop  (speed up 77%)
With patch #5: 1000 loops, best of 3: 609 usec per loop  (speed up 66%)

--
Added file: http://bugs.python.org/file39351/set_faster_copy_5.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-12 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Le 12/05/2015 20:55, Raymond Hettinger a écrit :
> 
> Optimizations aren't new features.  We can still tweak the implementation 
> through-out the betas.

Not really. The period after the first beta is for fixing bugs, not for
risking introducing new bugs.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-12 Thread Raymond Hettinger

Raymond Hettinger added the comment:

I looked at the patch again and it is in pretty good shape.

Please hoist the conditionals out of the loop (for intelligibility and to let 
the compiler in-line more effectively).  Also, let's remove the "dump" and 
"clear" variable names in favor of comments that explain the conditionals (to 
introducing new terminology to the module). 

If you want, I'll take a crack at it in the next couple of days.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-12 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Optimizations aren't new features.  We can still tweak the implementation 
through-out the betas.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-12 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Doesn't the feature freeze start from beta 1?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-12 Thread Raymond Hettinger

Raymond Hettinger added the comment:

I don't like the patch as-is (comment change is wrong, test in the inner-loop 
making the common case pay for a short-cut for the uncommon case).  As I said 
before, the basic idea is good and I will very likely include some variant of 
this for Python3.5.

I marked this as low priority so I can work on other things (such as deque 
slicing) before the feature freeze and will come back to this after beta 1.  
Please remain calm.  My development time is limited but this is something I do 
want to get done.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-05-12 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Could you please pay a little attention to this issue Raymond?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-03-27 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

> I didn't expect that it's still possible to enhance set.

Raymond made a lot of set optimizations recently.

> By the way, is it possible to report latest enhancement to the dict type? I
> don't remember if tou tried that richard?

May be when Raymond port his optimizations to dict (issue18898).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-03-27 Thread STINNER Victor

STINNER Victor added the comment:

New patch looks good to me. Great job on speedup. I didn't expect that it's
still possible to enhance set.

By the way, is it possible to report latest enhancement to the dict type? I
don't remember if tou tried that richard?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-03-27 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Updated patch addresses Victor's comment. Thank you for your review Victor.

> Is msg234811 the result of set_faster_copy_3.patch?

Yes, it is. There are tests for sets with small and large number of hash 
conflicts, with and without dummy objects. The patch affects many operations 
that implicitly makes a copy, for example set1 | set2, set1 & set2, set1 - set2.

--
Added file: http://bugs.python.org/file38706/set_faster_copy_4.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-03-26 Thread STINNER Victor

STINNER Victor added the comment:

Is msg234811 the result of set_faster_copy_3.patch? If yes, I see no reason to 
not apply the patch: the patch is short and looks always fast, and sometimes 
*much* faster.

I just leaved a small comment on the review.

--
nosy: +haypo

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-03-01 Thread Raymond Hettinger

Raymond Hettinger added the comment:

I still intend to apply some variant of your patch.  Am still looking at the 
options.

--
priority: normal -> low

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-03-01 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Ping. The last patch doesn't add more lookkey variants, it uses existing 
function.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-01-27 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Actually set_find_free_slot() is not needed because there is 
set_insert_clean(). Results with using set_insert_clean():

$ ./python -m timeit -s "s = set(range(10**4))" -- "frozenset(s)"
Unpatched: 1000 loops, best of 3: 700 usec per loop
Patched:   1000 loops, best of 3: 570 usec per loop
Speed up: 23%

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**4//2) for j in 
range(2)}" -- "frozenset(s)"
Unpatched: 100 loops, best of 3: 2.2 msec per loop
Patched:   1000 loops, best of 3: 765 usec per loop
Speed up: 188%

$ ./python -m timeit -s "s = set(range(10**4)); s.add(-1); s.discard(-1)" -- 
"frozenset(s)"
Unpatched: 1000 loops, best of 3: 700 usec per loop
Patched:   1000 loops, best of 3: 605 usec per loop
Speed up: 16%

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**4//2) for j in 
range(2)}; s.add(-1); s.discard(-1)" -- "frozenset(s)"
Unpatched: 100 loops, best of 3: 2.2 msec per loop
Patched:   1000 loops, best of 3: 1.06 msec per loop
Speed up: 108%

Note that unpatched version is 6% slower now than before, perhaps due to 
issue23119.

--
Added file: http://bugs.python.org/file37878/set_faster_copy_3.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-01-25 Thread Raymond Hettinger

Raymond Hettinger added the comment:

For the time being (while I working on other parts of sets), I'm holding-off an 
anything that adds more lookkey variants.   When the neatening-up and tweaking 
of search routines comes to a close, I'll revisit this one because it looks 
like there is a some low hanging fruit here.

--
resolution:  -> later

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-01-22 Thread Raymond Hettinger

Changes by Raymond Hettinger :


--
assignee:  -> rhettinger

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-01-21 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is even faster patch. When there are no dummies in source set we can just 
dump a table, placing entries at the same indices.

$ ./python -m timeit -s "s = set(range(10**4))" -- "frozenset(s)"
Unpatched: 1000 loops, best of 3: 658 usec per loop
Patched: 1000 loops, best of 3: 631 usec per loop

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**3) for j in 
range(10)}" -- "frozenset(s)"
Unpatched: 100 loops, best of 3: 6.72 msec per loop
Patched: 1000 loops, best of 3: 930 usec per loop

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**2) for j in 
range(10**2)}" -- "frozenset(s)"
Unpatched: 100 loops, best of 3: 14 msec per loop
Patched: 1000 loops, best of 3: 1.12 msec per loop

To test other branch we should add dummy entry: s.add(-1); s.discard(-1).

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**3) for j in 
range(10)}; s.add(-1); s.discard(-1)" -- "frozenset(s)"
Unpatched: 1000 loops, best of 3: 661 usec per loop
Patched: 1000 loops, best of 3: 643 usec per loop

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**3) for j in 
range(10)}; s.add(-1); s.discard(-1)" -- "frozenset(s)"
Unpatched: 100 loops, best of 3: 6.8 msec per loop
Patched: 100 loops, best of 3: 2.1 msec per loop

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**2) for j in 
range(10**2)}; s.add(-1); s.discard(-1)" -- "frozenset(s)"
Unpatched: 100 loops, best of 3: 14 msec per loop
Patched: 100 loops, best of 3: 2.71 msec per loop

--
Added file: http://bugs.python.org/file37808/set_faster_copy_2.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-01-21 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
keywords: +patch
nosy: +pitrou, rhettinger
Added file: http://bugs.python.org/file37807/set_faster_copy.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23290] Faster set copying

2015-01-21 Thread Serhiy Storchaka

New submission from Serhiy Storchaka:

Proposed patch makes faster creating new set from other set. The benefit is 
only few percents when there are no hash conflicts, but can be significant if 
there are hash duplicates.

$ ./python -m timeit -s "s = set(range(10**4))" -- "frozenset(s)"
Unpatched: 1000 loops, best of 3: 658 usec per loop
Patched: 1000 loops, best of 3: 620 usec per loop

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**3) for j in 
range(10)}" -- "frozenset(s)"
Unpatched: 100 loops, best of 3: 6.72 msec per loop
Patched: 100 loops, best of 3: 2.05 msec per loop

$ ./python -m timeit -s "s = {i+(j<<64) for i in range(10**2) for j in 
range(10**2)}" -- "frozenset(s)"
Unpatched: 100 loops, best of 3: 14 msec per loop
Patched: 100 loops, best of 3: 2.67 msec per loop

It should also affect set.copy and operations which makes implicit copy (most 
set operators). The effect should be larger for objects with slow equality 
operator.

set_find_free_slot() can be used to prevent a catastrophic linear pile-up in 
issue23259.

--
components: Interpreter Core
messages: 234437
nosy: serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Faster set copying
type: performance
versions: Python 3.5

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com