[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-18 Thread Joe Jevnik

New submission from Joe Jevnik:

When star unpacking positions into a function we can avoid a copy iff there are 
no extra arguments coming from the stack. Syntactically this looks like: 
`f(*args)`

This reduces the overhead of the call by about 25% (~30ns on my machine) based 
on some light profiling of just * unpacking. I can imagine this having a slight 
impact on code that uses this feature in a loop.

The cost is minimal when we need to make the copy because the int != 0 check is 
dwarfed by the allocation. I did not notice a significant change between the 
slow path and the original code.

The macro benchmark suite did not show a difference on my machine but this is 
not much more complex code.

--
components: Interpreter Core
files: call-function-var.patch
keywords: patch
messages: 263702
nosy: ll
priority: normal
severity: normal
status: open
title: Avoid copy in call_function_var when no extra stack args are passed
versions: Python 3.6
Added file: http://bugs.python.org/file42512/call-function-var.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-18 Thread SilentGhost

Changes by SilentGhost :


--
nosy: +haypo, martin.panter, yselivanov
stage:  -> patch review
type:  -> behavior

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-18 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

stararg can be not exact tuple, but an instance of tuple subclass.

Could you provide a microbenchmark that shows the effect of the optimization?

--
nosy: +serhiy.storchaka
type: behavior -> performance

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread Joe Jevnik

Joe Jevnik added the comment:

in _PyEval_EvalCodeWithName we will have already pulled the underlying array 
out of the tuple subclass and the then we will reconstruct the vararg value in 
the locals from this array. This means that with this patch we can get:

>>> class C(tuple): pass
... 
>>> def f(*args): print(type(args))
... 
>>> f(*C([1, 2, 3]))


The only case where we get potentially strange behavior is if we have a type 
whose tp_call did not forward to PyArg_Parse* or the PyTupleObject api and went 
though the abstract object interface. I ran the tests and the benchmark suite 
and did not run into any issues here but I can see how this would be 
potentially problematic behavior.

I have updated the code to also do a PyTuple_CheckExact against `stararg`. The 
tests I ran are:

timeit("h(*(a, b, c, d, e, f, g))", 'def h(a, b, c, d, e, f, g): pass\na, b, c, 
d, e, f, g = range(7)', number=int(10e7))

default: 17.191688070015516
patch: 14.060781285981648

There is also a really great case where you star unpack a tuple of literals 
which gets pushed into the co_consts:

timeit("f(*('a', 'b', 'c', 'd', 'e', 'f', 'g'))", 'def f(a, b, c, d, e, f, g): 
pass', number=int(10e7))

default: 13.842308042978402
patch: 9.696972723002546

This case seems far less common, but maybe someone has something like:

a = 1, 2, 3
...
f(*a)

--
type: performance -> 
Added file: http://bugs.python.org/file42515/call-function-var.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread Joe Jevnik

Changes by Joe Jevnik :


--
type:  -> performance

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread STINNER Victor

STINNER Victor added the comment:

+if (!nstar) {
+/* There are no positional arguments on the stack or in in a
+   sequence that was unpacked. */
+return PyTuple_New(0);
+}

It's possible that stararg is already an empty tuple. Is it worth to use it?

+if (!nstar) {
+if (stararg != NULL && PyTuple_CheckExact(stararg)) {
+Py_INCREF(stararg);
+return stararg;
+}
+else {
+/* There are no positional arguments on the stack or in in a
+   sequence that was unpacked. */
+return PyTuple_New(0);
+}
+}

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread STINNER Victor

STINNER Victor added the comment:

call-function-var.patch LGTM.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread STINNER Victor

STINNER Victor added the comment:

FYI I ran the full test suite with the second attached patch and all tests pass.

By the way, please add a suffix like -2, -3, etc. to your patches next time. 
It's easier to identify them if they have a different name ;-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread Joe Jevnik

Joe Jevnik added the comment:

> It's possible that stararg is already an empty tuple. Is it worth to use it?

PyTuple_New(0) should return the unit tuple in most cases:

#if PyTuple_MAXSAVESIZE > 0
if (size == 0 && free_list[0]) {
op = free_list[0];
Py_INCREF(op);
#ifdef COUNT_ALLOCS
tuple_zero_allocs++;
#endif
return (PyObject *) op;
}


Looking at this again I think that checking if stararg is nonnull is more 
clear, I will update the patch (as call-function-var-3.patch). I cannot exactly 
rewrite the if to use the control flow you show because that would cause 
non-tuple subclasses to forward a stararg of () instead of copying it into a 
normal tuple when no arguments are passed ont the stack.

--
Added file: http://bugs.python.org/file42521/call-function-var-3.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread Josh Rosenberg

Josh Rosenberg added the comment:

BTW, for a realistic use case that would be sped up by this patch (possibly a 
lot), consider a recursive function that is continually doing a "virtual" pop 
of the first argument, and receiving the rest as varargs, which are then 
unpacked for the recursive call, e.g., a function to make a multidimensional 
list populated with zeroes:

def nestedzeroes(firstdim, *dims):
'''Make multidimensional list populated with 0s'''
if not dims:
return [0] * firstdim
return [nestedzeroes(*dims) for _ in range(firstdim)]

It's somewhat less artificial in that it multiplies the effect of the 
optimization in a way that would actually exercise a microbenchmark-like 
scenario, where the amount of work involved in the star-args-only function 
calls is an appreciable percentage of the work. Running nestedzeroes(5, 5, 5) 
to create a 5x5x5 structure would involve 30 recursive calls; for 
nestedzeroes(10, 10, 10), 110 recursive calls.

I've actually used code like this (admittedly rarely) to avoid the hassle of 
inlining a many times nested set of list comprehensions to initialize a 
multidimensional list.

--
nosy: +josh.r

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

call-function-var-3.patch LGTM. Thank you for your contribution Joe.

--
assignee:  -> serhiy.storchaka
stage: patch review -> commit review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 15b20201cfa5 by Serhiy Storchaka in branch 'default':
Issue #26802: Optimized calling a function with *args only positional arguments.
https://hg.python.org/cpython/rev/15b20201cfa5

--
nosy: +python-dev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

FYI, there is a proposition about constructing arguments tuple and dict in 
bytecode instead of ceval.c. This will significantly simplify CALL_FUNCTION 
(which will get just one optional tuple and one optional dict). Likely this 
idea will be implemented after changing to wordcode.

--
resolution:  -> fixed
stage: commit review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread STINNER Victor

STINNER Victor added the comment:

Good news. On a microbenchmark, the patch makes func(*tuple) between 14% and 
18% faster. See attached bench.py. I ran the benchmark on Linux with isolated 
CPUs to get reliable results.
https://haypo-notes.readthedocs.org/microbenchmark.html

$ ./python-orig ~/prog/HG/misc/python/benchmark.py script bench.py --file=orig
$ ./python-patched ~/prog/HG/misc/python/benchmark.py script bench.py 
--file=patch
$ ./python-orig ~/prog/HG/misc/python/benchmark.py compare_to orig patch

Common platform:
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
SCM: hg revision=75f40345d784 branch=default date="2016-04-19 22:29 +0200"
Timer info: namespace(adjustable=False, 
implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, 
resolution=1e-09)
Timer: time.perf_counter
Bits: int=32, long=64, long long=64, size_t=64, void*=64
Python unicode implementation: PEP 393
CFLAGS: -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall 
-Wstrict-prototypes
Platform: Linux-4.4.4-301.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three

Platform of campaign orig:
Timer precision: 62 ns
Python version: 3.6.0a0 (default:75f40345d784, Apr 19 2016, 23:07:49) [GCC 
5.3.1 20151207 (Red Hat 5.3.1-2)]
Date: 2016-04-19 23:08:21

Platform of campaign patch:
Timer precision: 61 ns
Python version: 3.6.0a0 (default:084d5315dd50, Apr 19 2016, 23:06:38) [GCC 
5.3.1 20151207 (Red Hat 5.3.1-2)]
Date: 2016-04-19 23:08:39

-++--
Tests|   orig | 
patch
-++--
def func(a, b, c): pass args=(1, 2, 3); func(*args)  | 127 ns (*) | 104 ns 
(-18%)
def func(*args): pass args=(1, 2, 3); func(*args)| 147 ns (*) | 126 ns 
(-14%)
def func(*args): pass args=(2, 3, 4); func(1, *args) | 154 ns (*) |154 
ns
-++--
Total| 429 ns (*) | 384 ns 
(-10%)
-++--

--
Added file: http://bugs.python.org/file42524/bench.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread STINNER Victor

STINNER Victor added the comment:

> New changeset 15b20201cfa5 by Serhiy Storchaka in branch 'default':

Oh. I was cooking a commit, you was faster than me :-)

IMHO it's worth to make the optimization in Misc/NEWS, or maybe even in Whats 
new in Python 3.6.

I prepared the text:

Issue #26802: Optimize function calls only using unpacking like 
``func(*tuple)`` (no other positional argument, no keyword): avoid copying the 
tuple. Such function calls are up to 15% faster. Patch written by Joe Jevnik.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread STINNER Victor

STINNER Victor added the comment:

> FYI, there is a proposition about constructing arguments tuple and dict in 
> bytecode instead of ceval.c. This will significantly simplify CALL_FUNCTION 
> (which will get just one optional tuple and one optional dict). Likely this 
> idea will be implemented after changing to wordcode.

Wordcode is the issue #26647.

Do you have a reference to the more efficient implementation of CALL_FUNCTION? 
I recall vaguely this idea.

--

It was also proposed to pass keyword arguments as positional arguments: replace 
func(a=1, b=2) with func(1, 2) with "def func(a, b): ...". But this 
optimization requires something like FAT Python to disable the optimization if 
the function is replaced at runtime.

This optimization is more complex, maybe it's not worth.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

> Do you have a reference to the more efficient implementation of CALL_FUNCTION?

Actually it was about MAKE_FUNCTION. But I think the same is applicable to 
CALL_FUNCTION.

http://comments.gmane.org/gmane.comp.python.devel/157321

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread Joe Jevnik

Joe Jevnik added the comment:

CALL_FUNCTION doesn't use extended arg; I don't see what we get by adding some 
opcode to convert the sequence into a tuple. We also don't want to box up the 
stack arguments into a tuple because when we do a CALL_FUNCTION to a python 
function we just copy the stack[-n:] elements into the locals of the next 
frame. Emitting a build_tuple here would be an unneeded allocation.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 6535481d610e by Victor Stinner in branch 'default':
Optimize func(*tuple) function call
https://hg.python.org/cpython/rev/6535481d610e

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-19 Thread STINNER Victor

STINNER Victor added the comment:

> New changeset 6535481d610e by Victor Stinner in branch 'default':
> Optimize func(*tuple) function call
> https://hg.python.org/cpython/rev/6535481d610e

Oops. Mercurial is too hard for me /o\ I didn't want to push this change, since 
Serhiy already pushed the change itself. I asked Serhiy what he thinks about 
documenting the optimization. Well, now at least it's documented in NEWS. But 
I'm still interested to know if it's worth to mention the change in What's New 
in Python 3.6?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-20 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I was not sure about documenting this change in Misc/NEWS, but I think that it 
is not worth to mention it in What's New. This is very minor change. It's 
effect on any real code is negligible. I have pushed it only because it is 
trivial, unlikely has a negative effect, and there is a hope that the 
accumulated effect of several similar minor optimizations can be noticeable.

Simple refactoring or bug fix can have comparable performance effect on large 
program, but we don't mention this regression in What's New.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26802] Avoid copy in call_function_var when no extra stack args are passed

2016-04-20 Thread STINNER Victor

STINNER Victor added the comment:

Serhiy: "I was not sure about documenting this change in Misc/NEWS, but I think 
that it is not worth to mention it in What's New. This is very minor change. 
It's effect on any real code is negligible. I have pushed it only because it is 
trivial, unlikely has a negative effect, and there is a hope that the 
accumulated effect of several similar minor optimizations can be noticeable."

I documented other similar optimizations which probably have a negligible 
effect. I also expect a visible effect when other optimizations are accumulated.

https://docs.python.org/dev/whatsnew/3.6.html#optimizations

It also a deliberate choice to document optimization. It's to communicate on 
the fact that we *are working* on performance. It looks like users are happy to 
see work done on this area:
https://twitter.com/TalkPython/status/720704770198126594/photo/1

The drawback is maybe that users may have too high expectations on these 
optimizations.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com