[issue28580] Optimize iterating split table values

2017-03-31 Thread Donald Stufft

Changes by Donald Stufft :


--
pull_requests: +1015

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-04 Thread Xiang Zhang

Xiang Zhang added the comment:

Thanks!

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-04 Thread INADA Naoki

Changes by INADA Naoki :


--
resolution:  -> fixed
stage: commit review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-04 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 3904194d06e6 by INADA Naoki in branch 'default':
Issue #28580: Optimize iterating split table values.
https://hg.python.org/cpython/rev/3904194d06e6

--
nosy: +python-dev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-04 Thread INADA Naoki

INADA Naoki added the comment:

LGTM. I'll commit.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-03 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

v4 LGTM.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-03 Thread Xiang Zhang

Changes by Xiang Zhang :


Added file: http://bugs.python.org/file45332/iterate_splittable_v4.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-03 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

In current code there is no UB in _PyDict_Next().

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-03 Thread Xiang Zhang

Xiang Zhang added the comment:

> I would suggest just remove assert() from your patch and address undefined 
> behavior in other issue.

That's what v2 does. If there is another issue, let's also leave _PyDict_Next 
to it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-03 Thread Xiang Zhang

Xiang Zhang added the comment:

Currently dict iterator does not allow size changed during iteration. This is 
more strict than list iterator but still allow modification during iteration. 
Maybe we could deny all modification by checking dict->ma_version_tag. But 
that's irrelevant to this issue.

Serhiy, this means the iternext* functions all get UB. These codes existed even 
in 3.3. Do we really need to eliminate them now?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-03 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

It is appropriate if iterating modifying dict raises RuntimeError, produces 
less items or even produce some items multiple times. What is not appropriate 
-- crash, hang and infinite iteration. There was a proposition about making 
this behavior more consistent (issue19332), but it was rejected.

I would suggest just remove assert() from your patch and address undefined 
behavior in other issue.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-03 Thread Xiang Zhang

Xiang Zhang added the comment:

Hmm, the resolution could be simple. But how about

>>> d = dict.fromkeys(range(100))
>>> for k in range(98):
... del d[k]
... 
>>> it = iter(d)
>>> next(it)
98
>>> d.clear()
>>> d[0] = 1
>>> d[1] = 2
>>> next(it)
Traceback (most recent call last):
  File "", line 1, in 
StopIteration

Actually we haven't exhaust the dict yet. Is it reasonable for me to expect the 
second next returning 0? I actually mean is it reasonable for me to expect 
len(dict) elements returned before StopIteration raised even if the dict is 
changed?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-03 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Following example causes a crash in debug build:

d = dict.fromkeys(range(100))
for k in range(99):
del d[k]

it = iter(d)
assert next(it) == 99
d.clear()
d[0] = None
next(it)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread Xiang Zhang

Changes by Xiang Zhang :


Added file: http://bugs.python.org/file45330/iterate_splittable_v3.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread INADA Naoki

INADA Naoki added the comment:

(Off topic note)

For readability, I prefer this style:

  PyDictKeyEntry *entries = DK_ENTRIES(mp->ma_keys);
  while (i < n && entries[i].me_value == NULL)
  i++;

But because sizeof(PyDictKeyEntry) is not 2^n, entries[i].me_value, i++
may be slower than entry_ptr->me_value, i++, entry_ptr++.

So we should prefer i++, entry_ptr++ style in case of iterating dict entries.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I believe the patch doesn't makes iterating split table slower and likely makes 
it a little faster. The patch adds one additional check, but it removes other 
checks. Thus this still is an optimization.

But the last patch has an undefined behavior.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread INADA Naoki

INADA Naoki added the comment:

LGTM again for iterate_splittable_v2.patch.

How about issue title (and commit message) to
"Cleanup iterating split table"?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread INADA Naoki

INADA Naoki added the comment:

> Xiang Zhang added the comment:
>
> Yes, that's the point. I thought to expose an API in testcapimodule for test, 
> but actually I am not willing to do that since I don't believe this patch 
> could bring any visible performance change.
>

I agree with you.
I feel this patch is cleanup rather than optimization.

* Removing redundant loop which always break at first
* Adding very useful assertion to find potential bugs

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread Xiang Zhang

Xiang Zhang added the comment:

Yes, that's the point. I thought to expose an API in testcapimodule for test, 
but actually I am not willing to do that since I don't believe this patch could 
bring any visible performance change.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

There is not easy to benchmark PyDict_Next. dict.__repr__ does much hard work 
(calling repr() for keys and values, formatting strings). Using 
_testcapi.test_dict_iteration perhaps is the easiest way to measure the 
performance of pure PyDict_Next, but it includes creating and deallocating of 
dicts.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread Xiang Zhang

Xiang Zhang added the comment:

python -> unpatched, python3 -> patched

iterkeys:

(split)

./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'from 
argparse import Namespace; ns = Namespace(); [setattr(ns, str(i), str(i)) for i 
in range(1)]' 'list(iter(ns.__dict__))'
python: . 112 us +- 1 us
python3: . 109 us +- 1 us

Median +- std dev: [python] 112 us +- 1 us -> [python3] 109 us +- 1 us: 1.03x 
faster

(combined)

./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'd = {x:x 
for x in range(1)}' 'list(iter(d))'python: . 84.3 us +- 
2.4 us
python3: . 86.0 us +- 3.5 us

Median +- std dev: [python] 84.3 us +- 2.4 us -> [python3] 86.0 us +- 3.5 us: 
1.02x slower

pydict_next:

(split)

./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'from 
argparse import Namespace; ns = Namespace(); [setattr(ns, str(i), str(i)) for i 
in range(1)]' 'repr(ns.__dict__)'
python: . 1.85 ms +- 0.01 ms
python3: . 1.85 ms +- 0.11 ms

Median +- std dev: [python] 1.85 ms +- 0.01 ms -> [python3] 1.85 ms +- 0.11 ms: 
1.00x faster

(combined)

./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'd = {x:x 
for x in range(1)}' 'repr(d)'
python: . 1.99 ms +- 0.01 ms
python3: . 1.87 ms +- 0.01 ms

Median +- std dev: [python] 1.99 ms +- 0.01 ms -> [python3] 1.87 ms +- 0.01 ms: 
1.06x faster

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Could you please check how this change affects the performance?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread Xiang Zhang

Xiang Zhang added the comment:

Update the patch to use more obvious comparison way. This uses INADA's 
suggestion and make the code more like other places.

I think the performance issue is better to be discussed in #28397. It doesn't 
have a significant impact on this patch. Hope we can move forward and not block 
on it.

--
nosy: +haypo
Added file: http://bugs.python.org/file45315/iterate_splittable_v2.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

LGTM.

--
stage: patch review -> commit review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread Xiang Zhang

Xiang Zhang added the comment:

Here is the new patch to apply the optimization to more places.

--
Added file: http://bugs.python.org/file45312/iterate_splittable.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28580] Optimize iterating split table values

2016-11-02 Thread Xiang Zhang

Xiang Zhang added the comment:

> Xiang, would you update patch?

Working on it.

--
title: Optimize _PyDict_Next for split table -> Optimize iterating split table 
values

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com