[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-02-06 Thread STINNER Victor
Changes by STINNER Victor: resolution: -> fixed; stage: -> resolved; status: open -> closed

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-02-01 Thread STINNER Victor
STINNER Victor added the comment: "The default branch is now as good as Python 3.4, in terms of stack consumption, and Python 3.4 was the Python version which used the least stack memory according to my tests." I consider the initial issue fixed, so I am closing it. Thanks Serhiy

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-11 Thread STINNER Victor
STINNER Victor added the comment: I also ran the reliable performance benchmark suite with LTO+PGO. There is no significant performance change on these benchmarks: https://speed.python.org/changes/?rev=b9404639a18c&exe=5&env=speed-python The largest change is on scimark_lu (-13%), but there was

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Awesome! You are great Victor!

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: Result of attached bench_recursion-2.py comparing before/after the 3 changes reducing the stack consumption:
test_python_call: Median +- std dev: [a30cdf366c02] 512 us +- 12 us -> [6478e6d0476f] 467 us +- 21 us: 1.10x faster (-9%)
test_python_getitem: Median +
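The "1.10x faster (-9%)" summary follows directly from the two medians quoted above. A minimal sketch of that arithmetic, assuming the usual convention of dividing the old time by the new one for the speedup and taking the relative change against the old time (the helper name `compare` is hypothetical, not part of the perf module):

```python
def compare(before_us, after_us):
    """Return (speedup factor, relative change in percent) for two timings."""
    speedup = before_us / after_us
    change_pct = (after_us - before_us) / before_us * 100
    return speedup, change_pct


# Medians for test_python_call, taken from the message above.
speedup, change_pct = compare(512, 467)
print(f"{speedup:.2f}x faster ({change_pct:+.0f}%)")
```

The std dev figures (+- 12 us, +- 21 us) are ignored here; only the medians enter the summary.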

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: I pushed 3 changes:
* rev b9404639a18c: Issue #29233: call_method() now uses _PyObject_FastCall()
* rev 8481c379e2da: Issue #29227: inline call_function()
* rev 6478e6d0476f: Issue #29234: disable _PyStack_AsTuple() inlining
Before (rev a30cdf366c02): test_py

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: It seems that the subfunc.patch approach, using the "no inline" attribute, helps.

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: I created the issue #29227 "Reduce C stack consumption in function calls" which contains a first simple patch with a significant effect on the C stack.

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: Stack used by each C function of test_python_call.
3.4:
(a) method_call: 64
(b) PyObject_Call: 48
(b) function_call: 160
(b) PyEval_EvalCodeEx: 176
(c) PyEval_EvalFrameEx: 256
(c) call_function: 0
(c) do_call: 0
(c) PyObject_Call: 48
(d) slot_tp_call: 64
(d)

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment:
> no_small_stack-2.patch decreases it only by 6% (with possible performance loss).
Yeah, if we want to come back to Python 3.4 efficiency, we need to find the other functions which now use more stack memory ;-) The discussed "small stack" buffers are only

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thus Python 3.6 stack usage is about 20% larger than Python 3.5 and about 40% larger than Python 3.4. This is significant. :-( no_small_stack-2.patch decreases it only by 6% (with possible performance loss).
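The "about 20%" and "about 40%" figures can be checked against the bytes/call measurements posted elsewhere in this thread. The sketch below is only an illustration: it assumes that the unpatched "Reference" measurements stand in for the Python 3.6 numbers, which the thread does not state explicitly, and the dictionary and function names are hypothetical.

```python
# Stack bytes per call, copied from the measurements in this thread.
# Assumption: the unpatched "Reference" build represents Python 3.6 here.
STACK_BYTES_PER_CALL = {
    "3.4": {"test_python_call": 864, "test_python_getitem": 1008,
            "test_python_iterator": 1072},
    "3.5": {"test_python_call": 1008, "test_python_getitem": 1120,
            "test_python_iterator": 1232},
    "reference": {"test_python_call": 1168, "test_python_getitem": 1344,
                  "test_python_iterator": 1568},
}


def total_increase_pct(new, old):
    """Percent increase of the summed bytes/call of `new` over `old`."""
    new_total = sum(STACK_BYTES_PER_CALL[new].values())
    old_total = sum(STACK_BYTES_PER_CALL[old].values())
    return (new_total / old_total - 1) * 100


vs_35 = total_increase_pct("reference", "3.5")
vs_34 = total_increase_pct("reference", "3.4")
print(f"vs 3.5: +{vs_35:.0f}%, vs 3.4: +{vs_34:.0f}%")
```

Summing the three tests is one possible way to aggregate; comparing the tests one by one gives a spread around the same two figures.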

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: no_small_stack-2.patch has a very bad impact on performance:
haypo@speed-python$ python3 -m perf compare_to 2017-01-04_12-02-default-ee1390c9b585.json no_small_stack-2_refee1390c9b585.json -G --min-speed=5
Slower (59):
- telco: 15.7 ms +- 0.5 ms -> 23.4 ms +

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: Python 3.4 (rev 6340c9fcc111):
test_python_call: 9700 calls before crash, stack: 864 bytes/call
test_python_getitem: 8314 calls before crash, stack: 1008 bytes/call
test_python_iterator: 7818 calls before crash, stack: 1072 bytes/call
=> total: 25832 calls, 294
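The two columns above are linked: the number of calls before the crash multiplied by the bytes used per call should come close to the size of the C stack. A small sketch of that check follows; the 8 MiB limit is an assumption (a common Linux default for `ulimit -s`), since the thread does not say which limit was in effect.

```python
# (calls before crash, stack bytes per call) for Python 3.4, from the message above.
MEASUREMENTS = {
    "test_python_call": (9700, 864),
    "test_python_getitem": (8314, 1008),
    "test_python_iterator": (7818, 1072),
}

# Assumption: an 8 MiB stack limit, not stated anywhere in the thread.
ASSUMED_STACK_LIMIT = 8 * 1024 * 1024

stack_used = {name: calls * per_call
              for name, (calls, per_call) in MEASUREMENTS.items()}

for name, used in stack_used.items():
    share = used / ASSUMED_STACK_LIMIT * 100
    print(f"{name}: {used} bytes, {share:.2f}% of the assumed limit")
```

If the assumption holds, each test crashes once nearly the whole stack has been consumed, so fewer bytes per call translate directly into deeper recursion.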

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: What are the results with 3.4? There were several issues about stack overflow in 3.5 (issue25222, issue28179, issue28913).

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: Python 3.5 (revision 8125d9a8152b), before all fastcall changes:
test_python_call: 8314 calls before crash, stack: 1008 bytes/call
test_python_getitem: 7483 calls before crash, stack: 1120 bytes/call
test_python_iterator: 6802 calls before crash, stack: 1232 byt

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment:
> no_small_stack.patch:
Oops, you should read no_small_stack-2.patch in my previous message ;-)

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: no_small_stack-2.patch: Remove all "small_stack" buffers.
Reference
test_python_call: 7175 calls before crash, stack: 1168 bytes/call
test_python_getitem: 6235 calls before crash, stack: 1344 bytes/call
test_python_iterator: 5344 calls before crash, stack: 1568

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: stack_overflow_28870-sp.py: script using testcapi_stack_pointer.patch to compute the usage of the C stack. Results of this script.
(*) Reference
test_python_call: 7175 calls before crash, stack: 1168 bytes/call
test_python_getitem: 6235 calls before crash, sta

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-10 Thread STINNER Victor
STINNER Victor added the comment: testcapi_stack_pointer.patch: add _testcapi.stack_pointer() function. Added file: http://bugs.python.org/file46238/testcapi_stack_pointer.patch

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I'm not sure that the result of pyobjectl_callfunctionobjargs_stacksize() has a direct relation to the stack consumption in test_python_call, test_python_getitem and test_python_iterator. Try to measure the stack consumption in these cases. This can be done with _

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-09 Thread STINNER Victor
STINNER Victor added the comment: Impact of the _PY_FASTCALL_SMALL_STACK constant:
* _PY_FASTCALL_SMALL_STACK=1: 528 bytes/call
test_python_call 7376
test_python_getitem 6544
test_python_iterator 5572
=> total: 19 492
* _PY_FASTCALL_SMALL_STACK=3: 528 bytes/call
test_python_call 7272
test_pyt

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-09 Thread STINNER Victor
STINNER Victor added the comment: I modified Serhiy's stack_overflow.py of #28858:
* re-run each test 10 times and show the maximum depth
* only test: ['test_python_call', 'test_python_getitem', 'test_python_iterator']
Maximum number of Python calls before a crash.
(*) Reference (unpatched): 56

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-09 Thread Xiang Zhang
Changes by Xiang Zhang: nosy: +xiang.zhang

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-02 Thread STINNER Victor
STINNER Victor added the comment: In Python 3.5, PyObject_CallFunctionObjArgs() calls objargs_mktuple() which uses Py_VA_COPY(countva, va) and creates a tuple. The tuple constructor uses a free list to reduce the cost of heap memory allocations.

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-02 Thread STINNER Victor
STINNER Victor added the comment: no_small_stack.patch: And now something completely different, a patch to remove the "small stack" allocated on the C stack and always use heap memory instead. FYI I created no_small_stack.patch from less_stack.patch. As expected, the stack usage is lower: * less_st

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2017-01-02 Thread STINNER Victor
STINNER Victor added the comment: testcapi_stacksize.patch: add _testcapi.pyobjectl_callfunctionobjargs_stacksize(), function used to measure the stack consumption. Added file: http://bugs.python.org/file46119/testcapi_stacksize.patch

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2016-12-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I have tested all three patches with the stack_overflow.py script. The only tests affected are the recursive Python implementations of __call__, __getitem__ and __iter__.
unpatched less_stack alloca subfunc
test_python_call 9696
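For readers who have not seen stack_overflow.py, the following is a hypothetical sketch of what "recursive Python implementations of __call__, __getitem__ and __iter__" can look like. It is not the actual script: the class names are invented, and the depth is kept small so that nothing crashes. In each class the special method re-enters itself through the corresponding type slot, so every Python-level step also adds C frames to the stack.

```python
class RecursiveCall:
    """obj(n) re-enters __call__ through the type's call slot."""

    def __call__(self, depth):
        if depth == 0:
            return 0
        return self(depth - 1) + 1


class RecursiveGetItem:
    """obj[n] re-enters __getitem__ through the subscript slot."""

    def __getitem__(self, depth):
        if depth == 0:
            return 0
        return self[depth - 1] + 1


class RecursiveIter:
    """iter(obj) re-enters __iter__ through the iteration slot."""

    def __init__(self, depth):
        self.depth = depth

    def __iter__(self):
        if self.depth == 0:
            return iter(())
        return iter(RecursiveIter(self.depth - 1))


DEPTH = 50  # deliberately far below any recursion or stack limit
call_depth = RecursiveCall()(DEPTH)
getitem_depth = RecursiveGetItem()[DEPTH]
iter_items = list(RecursiveIter(DEPTH))
print(call_depth, getitem_depth, iter_items)
```

A crash test in the spirit of the thread would raise the recursion limit and increase the depth until the process dies, which is why such measurements are normally run in a child process.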

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2016-12-15 Thread STINNER Victor
STINNER Victor added the comment: For comparison, Python 3.5 (before fast calls) uses 448 bytes of C stack per call. Python 3.5 uses a tuple allocated in the heap memory.

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2016-12-15 Thread STINNER Victor
STINNER Victor added the comment: I also tried Serhiy's approach, splitting the function into subfunctions, but the result is not as good as expected: 496 bytes. See attached subfunc.patch. Added file: http://bugs.python.org/file45918/subfunc.patch

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2016-12-15 Thread STINNER Victor
STINNER Victor added the comment: I also tried to use alloca(): see attached alloca.patch. But the result is quite bad: 528 bytes of stack memory per call. I only attach the patch to discuss the issue, but I now dislike this option: the result is bad, and it is less portable and more dangerous.

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2016-12-15 Thread STINNER Victor
STINNER Victor added the comment: I don't propose to add _testcapi.pyobjectl_callfunctionobjargs_stacksize(). It's just to test the patch. I'm using it with:
$ ./python -c 'import _testcapi; n=100; print(_testcapi.pyobjectl_callfunctionobjargs_stacksize(n) / (n+1))'
384.0
The value of n has no

[issue28870] Reduce stack consumption of PyObject_CallFunctionObjArgs() and like

2016-12-15 Thread STINNER Victor
Changes by STINNER Victor: title: Refactor PyObject_CallFunctionObjArgs() and like -> Reduce stack consumption of PyObject_CallFunctionObjArgs() and like