[issue41835] Speed up dict vectorcall creation using keywords

2020-11-02 Thread Inada Naoki


Inada Naoki  added the comment:

While this is an interesting optimization, the gain is not large enough.
I am closing this issue for now.

@Marco Sulla
Optimizing dict is a rather hard job. If you want to continue, I have an idea:
`dict(zip(keys, row))` is a common use case. It is used by asdict() in
dataclasses, _asdict() in namedtuple, and csv DictReader.
Sniffing the zip object and presizing the dict may be an interesting
optimization.

But note that this idea has a low chance of being accepted too. We have tried
many ideas like this and rejected them ourselves without even creating a pull
request.
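
For readers unfamiliar with the pattern, here is a minimal sketch of `dict(zip(keys, row))` next to two of the standard-library helpers mentioned above. The sample data is made up, and nothing here presizes anything: as far as I know, presizing is only possible from C.

```python
import csv
import io
from collections import namedtuple

# The pattern itself: pair up column names with one row of values.
keys = ["name", "lang", "year"]
row = ["dict", "C", 1991]
record = dict(zip(keys, row))

# namedtuple exposes the same mapping through _asdict().
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)
as_dict = dict(zip(p._fields, p))

# csv.DictReader yields one mapping per data row, keyed by the header.
reader = csv.DictReader(io.StringIO("a,b\n1,2\n"))
first = next(reader)

print(record, as_dict, first)
```

In each case the number of items is known before the dict is built, which is what would make presizing possible in principle.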

--
resolution:  -> rejected
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41835] Speed up dict vectorcall creation using keywords

2020-11-02 Thread Inada Naoki


Inada Naoki  added the comment:

And bench_kwcall.py is a microbenchmark for _PyEval_EvalCode.

$ cpython/release/python -m pyperf compare_to master.json kwcall-nodup.json

kwcall-3: Mean +- std dev: [master] 192 us +- 2 us -> [kwcall-nodup] 175 us +- 1 us: 1.09x faster (-9%)
kwcall-6: Mean +- std dev: [master] 327 us +- 6 us -> [kwcall-nodup] 291 us +- 4 us: 1.12x faster (-11%)
kwcall-9: Mean +- std dev: [master] 436 us +- 10 us -> [kwcall-nodup] 373 us +- 5 us: 1.17x faster (-14%)

Geometric mean: 0.89 (faster)
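
The 0.89 figure can be reproduced from the rounded means above. This sketch assumes the summary is the geometric mean of the new/old time ratios; the dictionary name `timings_us` is only for illustration.

```python
from statistics import geometric_mean

# (master, kwcall-nodup) mean timings in microseconds, copied from the report.
timings_us = {
    "kwcall-3": (192, 175),
    "kwcall-6": (327, 291),
    "kwcall-9": (436, 373),
}

# A ratio below 1.0 means the patched build took less time.
ratios = [new / old for old, new in timings_us.values()]
gm = geometric_mean(ratios)
print(f"geometric mean of new/old: {gm:.2f}")
```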

--
Added file: https://bugs.python.org/file49561/bench_kwcall.py




[issue41835] Speed up dict vectorcall creation using keywords

2020-11-02 Thread Inada Naoki


Inada Naoki  added the comment:

Short result (minspeed=2):

Slower (4):
- unpack_sequence: 65.2 ns +- 1.3 ns -> 69.2 ns +- 0.4 ns: 1.06x slower (+6%)
- unpickle_list: 5.21 us +- 0.04 us -> 5.44 us +- 0.02 us: 1.04x slower (+4%)
- chameleon: 9.80 ms +- 0.08 ms -> 10.0 ms +- 0.1 ms: 1.02x slower (+2%)
- logging_silent: 202 ns +- 5 ns -> 206 ns +- 5 ns: 1.02x slower (+2%)

Faster (9):
- pickle_dict: 30.7 us +- 0.1 us -> 29.0 us +- 0.1 us: 1.06x faster (-5%)
- scimark_lu: 169 ms +- 3 ms -> 163 ms +- 3 ms: 1.04x faster (-4%)
- sympy_str: 396 ms +- 8 ms -> 383 ms +- 5 ms: 1.04x faster (-3%)
- sqlite_synth: 3.46 us +- 0.08 us -> 3.34 us +- 0.04 us: 1.03x faster (-3%)
- scimark_fft: 415 ms +- 3 ms -> 405 ms +- 3 ms: 1.03x faster (-3%)
- pickle_list: 4.91 us +- 0.07 us -> 4.79 us +- 0.04 us: 1.03x faster (-3%)
- dulwich_log: 82.4 ms +- 0.8 ms -> 80.4 ms +- 0.8 ms: 1.02x faster (-2%)
- scimark_sparse_mat_mult: 5.49 ms +- 0.03 ms -> 5.37 ms +- 0.02 ms: 1.02x faster (-2%)
- spectral_norm: 157 ms +- 1 ms -> 153 ms +- 4 ms: 1.02x faster (-2%)

Benchmark hidden because not significant (47): ...

Geometric mean: 1.00 (faster)

Long result is attached.

--
Added file: https://bugs.python.org/file49560/pr23106.txt




[issue41835] Speed up dict vectorcall creation using keywords

2020-11-02 Thread Inada Naoki


Inada Naoki  added the comment:

> I did PGO+LTO... --enable-optimizations --with-lto

I'm sorry about that. PGO+LTO *reduces* noise, but there is still noise. And
unpack_sequence is very fragile.
I tried your branch again, and unpack_sequence is 10% *slower* than on the
master branch.

I am running pyperformance with PR-23106, which simplifies your function and
uses it from _PyStack_AsDict() and _PyEval_EvalCode().

--
stage: patch review -> 




[issue41835] Speed up dict vectorcall creation using keywords

2020-11-02 Thread Inada Naoki


Change by Inada Naoki :


--
pull_requests: +22023
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/23106




[issue41835] Speed up dict vectorcall creation using keywords

2020-11-01 Thread Marco Sulla


Marco Sulla  added the comment:

I did PGO+LTO... --enable-optimizations --with-lto

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-31 Thread Inada Naoki


Inada Naoki  added the comment:

> It should *not* be affected by the change. Anyway, I run the bench other 10 
> times, and the lowest value with the CPython code without the PR is not lower 
> than 67.7 ns. With the PR, it reaches 53.5 ns. And I do not understand why.

The benchmark is heavily affected by code placement.
Even adding a dead function affects speed. Read vstinner's blog post and
presentation:

* https://vstinner.github.io/journey-to-stable-benchmark-deadcode.html
* https://speakerdeck.com/haypo/how-to-run-a-stable-benchmark?slide=9

That's why we recommend a PGO+LTO build for benchmarking.

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-31 Thread Marco Sulla


Marco Sulla  added the comment:

Well, actually Serhiy is right: the macro benchmarks do not seem to show
anything significant. Maybe the code can be used in other parts of CPython,
for example in _pickle, where dicts are loaded. But that would also require
exposing, maybe only internally, dictresize() and DICT_NEXT_VERSION(). I am
not sure that is desirable.

There's something that I do not understand: the speedup of unpack_sequence. I
checked the pyperformance code, and it's a microbenchmark for:

a, b = some_sequence

It should *not* be affected by the change. Anyway, I ran the benchmark another
10 times, and the lowest value with the CPython code without the PR is not
lower than 67.7 ns. With the PR, it reaches 53.5 ns. And I do not understand
why. Maybe it affects the creation of the dicts with the local and global vars?
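
One way to check that the unpacking statement itself does not build a dict is to look at its bytecode. This is only a sketch: `unpack` is a hypothetical function, and the exact opcode list varies between CPython versions.

```python
import dis

def unpack(some_sequence):
    # The statement the unpack_sequence benchmark is described as exercising.
    a, b = some_sequence
    return a

# Collect opcode names; the exact list depends on the interpreter version.
opnames = [ins.opname for ins in dis.get_instructions(unpack)]
print(opnames)
```

No dict-building opcode is expected in the output, which is consistent with the later replies attributing the difference to code placement rather than to the patch.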

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-31 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Do not overestimate the importance of _PyStack_AsDict(). Most calls (~90-95%
or so) use positional arguments only, and most functions do not have a
var-keyword parameter. So efforts in recent years were spent on optimizing the
common cases, in particular avoiding the creation of a dict when it is not
needed. _PyStack_AsDict() can affect perhaps 1% of code, or less, and these
functions are usually not performance critical.
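
To make the distinction concrete, here is a small sketch of the three call shapes. The function names are hypothetical, and which C function builds the dict in each case is an implementation detail not shown here.

```python
def positional_only_call(a, b):
    # The common case: no keyword dict is visible to the callee.
    return a + b

def named_keyword(a, *, key=None):
    # Keywords are matched to named parameters; still no **kwds dict.
    return key

def with_var_keyword(a, **kwds):
    # Only a var-keyword parameter receives the keywords as a dict.
    return kwds

source = {"x": 2, "y": 3}
received = with_var_keyword(1, **source)
print(positional_only_call(1, 2), named_keyword(1, key="k"), received)
```

Note that `received` is a new dict, not `source` itself, so a dict is built on every such call.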

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-31 Thread Inada Naoki


Inada Naoki  added the comment:

> Both changes add significant amount of code (100 and 85 lines 
> correspondingly). Even if they speed up a particular case of dict constructor 
> it is not common use case.

You are right, but please wait.

Marco is a new contributor and he can already write correct C code.
So I am searching for parts that can be optimized with his code before
rejecting it.

* bpo-42126, GH-22911: I can make dict display (aka dict literal) 50% faster.
But it introduces additional complexity to the compiler and ceval. So I will
reject it unless I find real-world code using dict display in a
performance-critical part.

* _PyStack_AsDict (https://github.com/methane/cpython/pull/25): I thought this
was a performance-critical function. But I could not see a significant
performance gain in pyperformance.

* _PyEval_EvalCode
(https://github.com/python/cpython/blob/master/Python/ceval.c#L4465): I am
still not sure we can assume there are no duplicated keyword arguments here. If
we can assume it, we can optimize calling functions that receive a **kwds
argument.

These three parts are all I found. I will reject this issue if I fail to
optimize _PyEval_EvalCode.
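
For context on the duplicate-keyword question, this sketch shows what is observable from Python: a call that supplies the same keyword twice through unpacking fails with a TypeError. It says nothing about where in the C code the check happens, which is the open point above. `receiver` is a hypothetical function.

```python
def receiver(**kwds):
    return kwds

# Writing receiver(a=1, a=2) literally is a SyntaxError, so the duplicate
# has to come from unpacking two mappings that share a key.
try:
    receiver(**{"a": 1}, **{"a": 2})
except TypeError as exc:
    error = str(exc)
else:
    error = None

print(error)
```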

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-31 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Both changes add a significant amount of code (100 and 85 lines respectively).
Even if they speed up a particular case of the dict constructor, it is not a
common use case.

I think that it would be better to reject these changes. They make maintenance
harder, the benefit seems insignificant, and there is always a danger that new
code can slow down other code. The dict object is performance critical for
Python, so it is better not to touch its code without need.

--
nosy: +serhiy.storchaka




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-30 Thread Inada Naoki


Inada Naoki  added the comment:

unpack_sequence is a very sensitive benchmark. Its speed changes dramatically
with code alignment. PGO+LTO will reduce the noise, but there is always some
noise.

I believe there is no significant performance change in macro benchmarks when
optimizing this part.

Not being significant in macro benchmarks doesn't mean we must reject the
optimization, because pyperformance doesn't cover every application in the
world.
But it means that we must be conservative about the optimization.

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-30 Thread Marco Sulla


Marco Sulla  added the comment:

Well, following your example, since split dicts seem to be no longer supported,
I decided to be more drastic. If you look at the last push in PR 22346, I no
longer check but always resize, so the dict is always combined. This seems to
be especially good for the "unpack_sequence" benchmark, even if I do not know
what it is:

| chaos                   | 132 ms  | 136 ms  | 1.03x slower | Significant (t=-18.09) |
| crypto_pyaes            | 136 ms  | 141 ms  | 1.03x slower | Significant (t=-11.60) |
| float                   | 133 ms  | 137 ms  | 1.03x slower | Significant (t=-16.94) |
| go                      | 276 ms  | 282 ms  | 1.02x slower | Significant (t=-11.58) |
| logging_format          | 12.3 us | 12.6 us | 1.02x slower | Significant (t=-9.75)  |
| logging_silent          | 194 ns  | 203 ns  | 1.05x slower | Significant (t=-9.00)  |
| logging_simple          | 11.3 us | 11.6 us | 1.02x slower | Significant (t=-12.56) |
| mako                    | 16.5 ms | 17.4 ms | 1.05x slower | Significant (t=-17.34) |
| meteor_contest          | 116 ms  | 120 ms  | 1.04x slower | Significant (t=-25.59) |
| nbody                   | 158 ms  | 166 ms  | 1.05x slower | Significant (t=-12.73) |
| nqueens                 | 107 ms  | 111 ms  | 1.03x slower | Significant (t=-11.39) |
| pickle_pure_python      | 631 us  | 619 us  | 1.02x faster | Significant (t=6.28)   |
| regex_compile           | 206 ms  | 214 ms  | 1.04x slower | Significant (t=-24.24) |
| regex_v8                | 28.4 ms | 26.7 ms | 1.06x faster | Significant (t=10.92)  |
| richards                | 87.8 ms | 90.3 ms | 1.03x slower | Significant (t=-10.91) |
| scimark_lu              | 165 ms  | 162 ms  | 1.02x faster | Significant (t=4.55)   |
| scimark_sor             | 210 ms  | 215 ms  | 1.02x slower | Significant (t=-10.14) |
| scimark_sparse_mat_mult | 6.45 ms | 6.64 ms | 1.03x slower | Significant (t=-6.66)  |
| spectral_norm           | 158 ms  | 171 ms  | 1.08x slower | Significant (t=-29.11) |
| sympy_expand            | 599 ms  | 619 ms  | 1.03x slower | Significant (t=-21.93) |
| sympy_str               | 376 ms  | 389 ms  | 1.04x slower | Significant (t=-23.80) |
| sympy_sum               | 233 ms  | 239 ms  | 1.02x slower | Significant (t=-14.70) |
| telco                   | 7.40 ms | 7.61 ms | 1.03x slower | Significant (t=-10.08) |
| unpack_sequence         | 70.0 ns | 56.1 ns | 1.25x faster | Significant (t=10.62)  |
| xml_etree_generate      | 108 ms  | 106 ms  | 1.02x faster | Significant (t=5.52)   |
| xml_etree_iterparse     | 133 ms  | 130 ms  | 1.02x faster | Significant (t=11.33)  |
| xml_etree_parse         | 208 ms  | 204 ms  | 1.02x faster | Significant (t=9.19)   |

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-24 Thread Marco Sulla


Marco Sulla  added the comment:

I commented out sqlalchemy in the requirements.txt in the pyperformance source
code, and it worked. I also had to skip tornado:

pyperformance run -r -b,-sqlalchemy_declarative,-sqlalchemy_imperative,-tornado_http -o ../perf_master.json

This is my result:

pyperformance compare perf_master.json perf_dict_init.json -O table | grep Significant
| 2to3            | 356 ms  | 348 ms  | 1.02x faster | Significant (t=7.28)   |
| fannkuch        | 485 ms  | 468 ms  | 1.04x faster | Significant (t=9.68)   |
| pathlib         | 22.5 ms | 22.1 ms | 1.02x faster | Significant (t=13.02)  |
| pickle_dict     | 29.0 us | 30.3 us | 1.05x slower | Significant (t=-92.36) |
| pickle_list     | 4.55 us | 4.64 us | 1.02x slower | Significant (t=-10.87) |
| pyflate         | 735 ms  | 702 ms  | 1.05x faster | Significant (t=6.67)   |
| regex_compile   | 197 ms  | 193 ms  | 1.02x faster | Significant (t=2.81)   |
| regex_v8        | 24.5 ms | 23.9 ms | 1.02x faster | Significant (t=17.63)  |
| scimark_fft     | 376 ms  | 386 ms  | 1.03x slower | Significant (t=-15.07) |
| scimark_lu      | 154 ms  | 158 ms  | 1.03x slower | Significant (t=-12.94) |
| sqlite_synth    | 3.35 us | 3.21 us | 1.04x faster | Significant (t=17.65)  |
| telco           | 6.54 ms | 7.14 ms | 1.09x slower | Significant (t=-8.51)  |
| unpack_sequence | 58.8 ns | 61.5 ns | 1.04x slower | Significant (t=-19.66) |

It's strange that some benchmarks are slower, since the patch only adds two
additional checks to dict_vectorcall. Maybe they use many small dicts?

@methane:
> Would you implement some more optimization based on your PR to demonstrate 
> your idea?

I have already done them; I'll open a PR.

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-24 Thread Inada Naoki


Inada Naoki  added the comment:

I confirmed _PyDict_FromItems() can be used to optimize _PyStack_AsDict() too.
See https://github.com/methane/cpython/pull/25

But I cannot confirm a significant performance gain from it either.

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-24 Thread Inada Naoki


Inada Naoki  added the comment:

@Marco Sulla

> @methane: well, to be honest, I don't see much difference between the two 
> pulls. The major difference is that you merged insertdict_init in 
> dict_merge_init.

Not only that, but also some simplifications which make it 10% faster than
GH-22346.

> But I kept insertdict_init separate on purpose, because this function can be 
> used in other future dedicated function on creation time only.

Where do you expect to use it? Would you implement some more optimizations
based on your PR to demonstrate your idea?

I confirmed that GH-22909 can be used to optimize BUILD_CONST_KEY_MAP
(GH-22911). That's why I merged the two functions.
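
For reference, BUILD_CONST_KEY_MAP is the opcode the compiler used at the time for a dict display whose keys are all constants. The sketch below lists the dict-building opcodes for such a literal; `make_point` is a hypothetical function, and on other CPython versions the compiler may emit BUILD_MAP instead, so the output is not fixed.

```python
import dis

def make_point(x, y):
    # A dict display with constant keys and non-constant values.
    return {"x": x, "y": y}

opnames = [ins.opname for ins in dis.get_instructions(make_point)]
# Keep only the opcodes that build a mapping.
dict_ops = [name for name in opnames if name.startswith("BUILD_") and "MAP" in name]
print(dict_ops)
```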

> AssertionError: would build wheel with unsupported tag ('cp310', 'cp310', 
> 'linux_x86_64')

Try `pip install pyperformance==1.0.0`.

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-24 Thread Inada Naoki


Inada Naoki  added the comment:

@Mark.Shannon I had seen some speedup on the tornado benchmark when I didn't
use PGO+LTO, but it was noise.

Now I use PGO+LTO. master vs PR-22909:

$ ./python -m pyperf compare_to master-opt.json speedup_kw-opt.json -G --min-speed=1
Slower (11):
- spectral_norm: 147 ms +- 1 ms -> 153 ms +- 2 ms: 1.04x slower (+4%)
- pickle_dict: 28.6 us +- 0.1 us -> 29.5 us +- 0.6 us: 1.03x slower (+3%)
- regex_compile: 199 ms +- 1 ms -> 204 ms +- 4 ms: 1.03x slower (+3%)
- chameleon: 9.75 ms +- 0.10 ms -> 9.99 ms +- 0.09 ms: 1.02x slower (+2%)
- logging_format: 10.9 us +- 0.2 us -> 11.1 us +- 0.2 us: 1.02x slower (+2%)
- sqlite_synth: 3.29 us +- 0.05 us -> 3.36 us +- 0.05 us: 1.02x slower (+2%)
- regex_v8: 26.1 ms +- 0.1 ms -> 26.5 ms +- 0.3 ms: 1.02x slower (+2%)
- json_dumps: 14.6 ms +- 0.1 ms -> 14.8 ms +- 0.1 ms: 1.02x slower (+2%)
- logging_simple: 9.88 us +- 0.18 us -> 10.0 us +- 0.2 us: 1.02x slower (+2%)
- nqueens: 105 ms +- 1 ms -> 107 ms +- 2 ms: 1.01x slower (+1%)
- raytrace: 511 ms +- 5 ms -> 517 ms +- 6 ms: 1.01x slower (+1%)

Faster (10):
- regex_dna: 233 ms +- 1 ms -> 229 ms +- 1 ms: 1.02x faster (-2%)
- unpickle: 14.7 us +- 0.1 us -> 14.5 us +- 0.2 us: 1.02x faster (-1%)
- deltablue: 8.17 ms +- 0.29 ms -> 8.06 ms +- 0.17 ms: 1.01x faster (-1%)
- mako: 16.8 ms +- 0.2 ms -> 16.6 ms +- 0.1 ms: 1.01x faster (-1%)
- xml_etree_iterparse: 117 ms +- 1 ms -> 116 ms +- 1 ms: 1.01x faster (-1%)
- scimark_monte_carlo: 117 ms +- 2 ms -> 115 ms +- 1 ms: 1.01x faster (-1%)
- xml_etree_parse: 164 ms +- 3 ms -> 162 ms +- 1 ms: 1.01x faster (-1%)
- unpack_sequence: 62.7 ns +- 0.7 ns -> 62.0 ns +- 0.7 ns: 1.01x faster (-1%)
- regex_effbot: 3.43 ms +- 0.01 ms -> 3.39 ms +- 0.02 ms: 1.01x faster (-1%)
- scimark_fft: 405 ms +- 4 ms -> 401 ms +- 1 ms: 1.01x faster (-1%)

Benchmark hidden because not significant (39)

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-23 Thread Marco Sulla


Marco Sulla  added the comment:

@Mark.Shannon I tried to run pyperformance, but wheel does not work for Python 
3.10. I get the error:

AssertionError: would build wheel with unsupported tag ('cp310', 'cp310', 'linux_x86_64')

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-23 Thread Marco Sulla


Marco Sulla  added the comment:

@methane: well, to be honest, I don't see much difference between the two
pull requests. The major difference is that you merged insertdict_init into
dict_merge_init.

But I kept insertdict_init separate on purpose, because this function can be
used in other future dedicated functions at creation time only. Furthermore,
it's simpler to maintain, since it's almost identical to insertdict.

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-23 Thread Mark Shannon


Mark Shannon  added the comment:

Could we get a pyperformance benchmark run on this please?

--
nosy: +Mark.Shannon




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-23 Thread Inada Naoki


Inada Naoki  added the comment:

@Marco Sulla Please take a look at GH-22909. It is a simplified version of your
PR. And I wrote another optimization based on it: #42126.

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-22 Thread Inada Naoki


Inada Naoki  added the comment:

Ok. The performance improvement comes from:

a. Presizing
b. Bypassing some checks in PyDict_SetItem
c. Avoiding duplication check.

(b) is relatively small so I tried to focus on (a) and (c). See GH-22909.
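
Point (a) is about avoiding repeated reallocation while the dict grows. Pure Python cannot presize a dict, but a rough sketch can count how often a dict filled one key at a time changes its allocation. `count_resizes` is a hypothetical helper, and treating a change in `sys.getsizeof` as a resize is an assumption tied to CPython.

```python
import sys

def count_resizes(n_items):
    """Count how many times the dict's reported size changes while filling it."""
    d = {}
    last_size = sys.getsizeof(d)
    resizes = 0
    for i in range(n_items):
        d[f"key{i}"] = i
        size = sys.getsizeof(d)
        if size != last_size:
            # The reported size changed, so the table was reallocated.
            resizes += 1
            last_size = size
    return resizes

# 16 keys matches the size of the keyword benchmark in this message.
print(count_resizes(16))
```

A presized dict would pay for one allocation instead of each of these steps.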

In the case of simple keyword arguments, it is 10% faster than GH-22346:

```
$ ./python -m pyperf timeit --compare-to ./python-speedup_kw 
"dict(ihinvdono='doononon', gowwondwon='nwog', bdjbodbob='nidnnpn', 
nwonwno='vndononon', dooodbob='iohiwipwgpw', doidonooq='ndwnnpnpnp', 
fndionqinqn='ndjboqoqjb', nonoeoqgoqb='bdboboqbgoqeb', 
jdnvonvoddo='nvdjnvndvonoq', njnvodnoo='hiehgieba', nvdnvwnnp='wghgihpa', 
nvfnwnnq='nvdknnnqkm', ndonvnipnq='fndjnaobobvob', fjafosboab='ndjnodvobvojb', 
nownwnojwjw='nvknnndnow', niownviwnwnwi='nownvwinvwnwnwj')"
python-speedup_kw: . 357 ns +- 10 ns
python: . 323 ns +- 4 ns

Mean +- std dev: [python-speedup_kw] 357 ns +- 10 ns -> [python] 323 ns +- 4 ns: 1.11x faster (-10%)
```

In the `dict(d, key=val)` case, it is 8% slower than GH-22346, but still 8%
faster than master.

```
$ ./python -m pyperf timeit --compare-to ./python-speedup_kw -s 
'd={"foo":"bar"}' "dict(d, ihinvdono='doononon', gowwondwon='nwog', 
bdjbodbob='nidnnpn', nwonwno='vndononon', dooodbob='iohiwipwgpw', 
doidonooq='ndwnnpnpnp', fndionqinqn='ndjboqoqjb', nonoeoqgoqb='bdboboqbgoqeb', 
jdnvonvoddo='nvdjnvndvonoq', njnvodnoo='hiehgieba', nvdnvwnnp='wghgihpa', 
nvfnwnnq='nvdknnnqkm', ndonvnipnq='fndjnaobobvob', fjafosboab='ndjnodvobvojb', 
nownwnojwjw='nvknnndnow', niownviwnwnwi='nownvwinvwnwnwj')"
python-speedup_kw: . 505 ns +- 15 ns
python: . 546 ns +- 17 ns

Mean +- std dev: [python-speedup_kw] 505 ns +- 15 ns -> [python] 546 ns +- 17 ns: 1.08x slower (+8%)

$ ./python -m pyperf timeit --compare-to ./python-master -s 'd={"foo":"bar"}' 
"dict(d, ihinvdono='doononon', gowwondwon='nwog', bdjbodbob='nidnnpn', 
nwonwno='vndononon', dooodbob='iohiwipwgpw', doidonooq='ndwnnpnpnp', 
fndionqinqn='ndjboqoqjb', nonoeoqgoqb='bdboboqbgoqeb', 
jdnvonvoddo='nvdjnvndvonoq', njnvodnoo='hiehgieba', nvdnvwnnp='wghgihpa', 
nvfnwnnq='nvdknnnqkm', ndonvnipnq='fndjnaobobvob', fjafosboab='ndjnodvobvojb', 
nownwnojwjw='nvknnndnow', niownviwnwnwi='nownvwinvwnwnwj')"
python-master: . 598 ns +- 10 ns
python: . 549 ns +- 19 ns

Mean +- std dev: [python-master] 598 ns +- 10 ns -> [python] 549 ns +- 19 ns: 1.09x faster (-8%)
```

Additionally, I expect we can reuse this new code to optimize 
BUILD_CONST_KEY_MAP.

--
stage: patch review -> 




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-22 Thread Inada Naoki


Change by Inada Naoki :


--
keywords: +patch
pull_requests: +21840
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/22909




[issue41835] Speed up dict vectorcall creation using keywords

2020-10-22 Thread Marco Sulla


Marco Sulla  added the comment:

Another bench:

python -m pyperf timeit --rigorous "dict(ihinvdono='doononon', 
gowwondwon='nwog', bdjbodbob='nidnnpn', nwonwno='vndononon', 
dooodbob='iohiwipwgpw', doidonooq='ndwnnpnpnp', fndionqinqn='ndjboqoqjb', 
nonoeoqgoqb='bdboboqbgoqeb', jdnvonvoddo='nvdjnvndvonoq', 
njnvodnoo='hiehgieba', nvdnvwnnp='wghgihpa', nvfnwnnq='nvdknnnqkm', 
ndonvnipnq='fndjnaobobvob', fjafosboab='ndjnodvobvojb', 
nownwnojwjw='nvknnndnow', niownviwnwnwi='nownvwinvwnwnwj')"

Result without pull:
Mean +- std dev: 486 ns +- 8 ns

Result with pull:
Mean +- std dev: 328 ns +- 4 ns

I compiled both with optimizations and lto.

Some arch info:

python -VV
Python 3.10.0a1+ (heads/master-dirty:dde91b1953, Oct 22 2020, 14:00:51) [GCC 10.1.1 20200718]

uname -a
Linux buzz 4.15.0-118-generic #119-Ubuntu SMP Tue Sep 8 12:30:01 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.5 LTS

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-09-23 Thread Marco Sulla


Marco Sulla  added the comment:

> `dict(**o)` is not common use case. Could you provide some other benchmarks?

You can do

python -m timeit -n 200 "dict(key1=1, key2=2, key3=3, key4=4, key5=5, 
key6=6, key7=7, key8=8, key9=9, key10=10)"

or with pyperf. In this case, since the dict is small, I observed a speedup of
25%.
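
The same measurement can be scripted with the `timeit` module instead of the command line. This is a sketch only: `number=200` mirrors `-n 200`, while `repeat=5` and reporting the best run are assumptions, and absolute numbers will differ by machine and build.

```python
import timeit

# Same statement as the command line above.
stmt = ("dict(key1=1, key2=2, key3=3, key4=4, key5=5, "
        "key6=6, key7=7, key8=8, key9=9, key10=10)")

# Each entry in runs is the total time in seconds for 200 executions.
runs = timeit.repeat(stmt, repeat=5, number=200)
best_ns_per_call = min(runs) / 200 * 1e9
print(f"best of 5: {best_ns_per_call:.0f} ns per call")
```

Running it once on each build and comparing the two figures gives the kind of percentage quoted above, though pyperf is the more reliable tool for that.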

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-09-22 Thread Inada Naoki


Inada Naoki  added the comment:

I have a Linux desktop machine for benchmarking & profiling in my office. But
the machine is offline and I have been working from home for several weeks.
So please wait several weeks until I can check your branch.

> This change speeds up the code up to a 30%. Tested with:
>
>  python -m timeit -n 2000  --setup "from uuid import uuid4 ; o =
>  {str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
>  in range(1)}" "dict(**o)"

`dict(**o)` is not a common use case. Could you provide some other benchmarks?

--




[issue41835] Speed up dict vectorcall creation using keywords

2020-09-22 Thread Marco Sulla


New submission from Marco Sulla :

I've made a PR that speeds up the vectorcall creation of a dict using keyword
arguments. In practice, the PR creates insertdict_init(), a specialized
version of insertdict. I quote the comment on the function:

Same as insertdict but specialized for inserting without resizing and for dicts
that are populated in a loop and were empty before (see the empty arg).
Note that resizing must be done before calling this function. If that is not
possible, use insertdict(). Furthermore, ma_version_tag is left unchanged; you
have to change it after calling this function (probably at the end of a loop).

This change speeds up the code by up to 30%. Tested with:

python -m timeit -n 2000  --setup "from uuid import uuid4 ; o =
{str(uuid4()).replace('-', '') : str(uuid4()).replace('-', '') for i
in range(1)}" "dict(**o)"
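
To spell out what the benchmark statement does, here is a sketch that rebuilds the setup and the measured expression in plain Python. It uses `uuid4().hex`, which yields the same string as the `str(...).replace('-', '')` form in the command; the variable names `same` and `copy` are only for illustration.

```python
from uuid import uuid4

u = uuid4()
# str(u) is the hyphenated form; stripping the hyphens gives u.hex.
same = str(u).replace("-", "") == u.hex

# Setup as written in the command: random string keys and values.
o = {uuid4().hex: uuid4().hex for i in range(1)}

# The timed statement: every item of o is passed as a keyword argument,
# and dict() builds a new dict from those keywords.
copy = dict(**o)
print(len(o), copy == o, copy is o)
```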

--
components: Interpreter Core
messages: 377318
nosy: Marco Sulla, inada.naoki
priority: normal
pull_requests: 21398
severity: normal
status: open
title: Speed up dict vectorcall creation using keywords
versions: Python 3.10
