[issue24275] lookdict_* give up too soon

2021-04-28 Thread Inada Naoki


Change by Inada Naoki :


--
resolution: remind -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-04-28 Thread Inada Naoki


Inada Naoki  added the comment:


New changeset 8557edbfa8f74514de82feea4c62f5963e4e0aa7 by Hristo Venev in 
branch 'master':
bpo-24275: Don't downgrade unicode-only dicts to mixed on lookups (GH-25186)
https://github.com/python/cpython/commit/8557edbfa8f74514de82feea4c62f5963e4e0aa7


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-04-04 Thread Roundup Robot


Change by Roundup Robot :


--
nosy: +python-dev
nosy_count: 7.0 -> 8.0
pull_requests: +23928
pull_request: https://github.com/python/cpython/pull/25186

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-03-31 Thread Hristo Venev


Hristo Venev  added the comment:

Sorry, I must have missed Inada Naoki's reply. I will try to send a pull 
request this weekend.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-03-31 Thread Jim Jewett


Jim Jewett  added the comment:

What is the status on this?  If you are losing interest, would you like someone 
else to turn your patch into a pull request?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-02-22 Thread Inada Naoki


Inada Naoki  added the comment:

I'm +1 for this. Would you create a pull request on GitHub?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-01-31 Thread Inada Naoki


Change by Inada Naoki :


--
nosy: +methane

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-01-31 Thread Hristo Venev


Hristo Venev  added the comment:

I've attached a patch that should also contain a test.

I also ran some benchmarks on dict creation/inserts. I couldn't notice any 
difference in performance.

--
Added file: 
https://bugs.python.org/file49782/0001-Don-t-downgrade-unicode-only-dicts-to-mixed-on-non-u.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-01-30 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Please test also creating a large dict with string-only or non-string keys. 
Since you add a code in insertdict() it can potentially affects performance.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-01-30 Thread Hristo Venev


Hristo Venev  added the comment:

> Why is the first key built up as vx='x'; vx += '1' instead of just k1="x1"?

I wanted to construct a key that is equal to, but not the same object as, 
`'x1'`. Consider this example:

assert 'x1' is 'x1'
spam = 'x1'
assert spam is 'x1'
eggs = 'x'
eggs += '1'
assert eggs == 'x1'
assert eggs is not 'x1'
assert sys.intern(eggs) is 'x1'

When doing a dict lookup and the lookup key is the same object as a stored 
entry, `__eq__` is not called. Lookups are then significantly faster, maybe 20%.

Consider this example:

class EqTest:
def __eq__(self, other):
raise RuntimeError
def __hash__(self):
return id(self)

adict = {}
k1 = EqTest()
k2 = EqTest()

adict[k1] = 42
adict[k2] = 43
print(adict[k1], adict[k2])

Here `k1` is considered the same as `k1` and `k2` is considered the same as 
`k2`. However, `k1` and `k2` are considered distinct and never compared because 
they have different hashes.

However, if we were to set `EqTest.__hash__ = lambda self: 42`, we'd get a 
RuntimeError when we try to set `adict[k2]` because it would get compared for 
equality with `k1`.

Even if `__eq__` works, we can get some interesting behaviors. For example, 
when using multiple instances of `float('nan')` as keys.

> Using a str subclass in the test is a great idea, and you've created a truly 
> minimal one.  It would probably be good to *also* test with a non-string, 
> like 3 or 42.0.  I can't imagine this affecting things (unless you missed an 
> eager lookdict demotion somewhere), but it would be good to have that path 
> documented against regression.

I also tested a custom class that compares equal to strings. Other than being 
much slower, there weren't any significant differences. I also did some checks 
with int key lookups, which obviously fail with KeyError. They did not make the 
performance worse for the subsequent str lookups.

I will try to make a proper test tomorrow.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-01-30 Thread Jim Jewett


Jim Jewett  added the comment:

Based on Hristo's timing, it appears to be a clear win.  

A near-wash for truly string-only dicts that shouldn't be effected; a near-wash 
for looking up non-(exact-)strings, and a nearly 40% speedup for the target 
case of looking up but not inserting a non-string or string subclass, then 
looking up strings thereafter. 

Additional comments:

Barring objections, I will promote from patch review to commit review when I've 
had a chance to look more closely.  I don't have commit privs, but I think some 
of the others following this issue do.

The test looks pretty good enough -- good enough that I wonder if I'm missing 
something on the parts that seem odd.  It would be great if you either cleaned 
them up or commented to explain why:

Why is the first key vx1, which seems, if anything, like a variable? 
 Why not k1 or string_key?

Why is the first key built up as vx='x'; vx += '1' instead of just k1="x1"?

Using a str subclass in the test is a great idea, and you've created a truly 
minimal one.  It would probably be good to *also* test with a non-string, like 
3 or 42.0.  I can't imagine this affecting things (unless you missed an eager 
lookdict demotion somewhere), but it would be good to have that path documented 
against regression.

This seems like a test that could probably be rolled into a bigger testfile for 
the actual commit.  I don't have the name of such an appropriate file at hand 
right now, but will try to find it on a deeper review.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-01-30 Thread Jim Jewett


Jim Jewett  added the comment:

This was originally "can be reopened if a patch is submitted" and Hristo Venev 
has now done so. Therefore, I am reopening.

--
resolution: rejected -> remind
stage:  -> patch review
status: closed -> open
versions: +Python 3.10

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-01-28 Thread Hristo Venev


Hristo Venev  added the comment:

Benchmark program attached.

0. Creates a dict with str keys
1. str lookups
2. str subclass lookups on the same dict
3. str lookups on the same dict

Results before patch:
0.9493069459404069 +- 0.004707371313935551
1.47313450980997 +- 0.01350596115630158
1.3181799192904144 +- 0.006550182814933545

Results after patch:
0.9498907704499289 +- 0.003721378313122522
1.4936510094799451 +- 0.057905386684185135
0.9494844124498195 +- 0.0029465760297623235

--
Added file: https://bugs.python.org/file49773/bench.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2021-01-28 Thread Hristo Venev


Hristo Venev  added the comment:

I've attached a patch. I will soon provide benchmark results.

--
keywords: +patch
nosy: +h.venev
Added file: 
https://bugs.python.org/file49772/0001-Don-t-downgrade-unicode-only-dicts-to-mixed-on-non-u.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2015-05-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

It is not obvious that the patch is needed. If you have ready patch and good 
benchmark results, you could reopen the issue. Otherwise status quo wins.

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2015-05-25 Thread Mark Shannon

Mark Shannon added the comment:

I don't understand why this has been closed.
I agree with Jim's analysis.

Lookups do not change the dict and the choice of lookdict_* variant depends 
solely on the set of keys.

In fact, lookdict_split *doesn't* replace itself, it merely calls look_dict, 
leaving maintaining the invariant to insertdict.

Benjamin, could you please reopen this and mark it as needing a patch.

--
nosy: +Mark.Shannon

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2015-05-24 Thread Benjamin Peterson

Benjamin Peterson added the comment:

You could have a non-unicode key that compares equal to a unicode string.

--
nosy: +benjamin.peterson
resolution:  -> rejected
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2015-05-23 Thread Dmitry Kazakov

Changes by Dmitry Kazakov :


--
nosy: +vlth

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24275] lookdict_* give up too soon

2015-05-23 Thread Jim Jewett

New submission from Jim Jewett:

The specialized lookdict_* variants replace themselves with the generic 
lookdict as soon as a non-unicode key is looked up.  

They could reasonably leave the replacement to insertdict (line 819, currently 
assert rather than a replacement), when a non-unicode key is actually inserted 
into the dict. 

While inserts are less common than (all lookups including insert), I see the 
main advantage as reducing the number of duplications of the replacement logic.

--
components: Interpreter Core
messages: 243969
nosy: Jim.Jewett
priority: normal
severity: normal
status: open
title: lookdict_* give up too soon
type: enhancement

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com