[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-22 Thread Catalin Gabriel Manciu

Catalin Gabriel Manciu added the comment:

I've just posted the results of an OpenStack Swift benchmark run using the 
patch from my proposal in issue #26382.
Victor's patch, applied to CPython 2.7, adds an extra 1% on top of mine 
(which improved throughput by 1%), effectively doubling the performance gain.
Swift is a highly complex real-world workload, so this result is quite 
significant.

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-22 Thread STINNER Victor

STINNER Victor added the comment:

> Compared to my proposal in issue #26382, this patch yields slightly better 
> results for CPython 3.6, gaining an average of +0.36% on GUPB,
> and similar results for CPython 2.7.

IMHO this change is too young to be backported to Python 2.7. I wrote it for 
Python 3.6 only. For Python 2.7, I suggest writing patches with a narrow scope, 
as you did with the patch that only modifies the list type.

"""
Table 1: CPython 3 GUPB results
-------------------------------
unpickle_list          22.74%
mako_v2                 9.13%
nqueens                 6.32%
meteor_contest          5.61%
fannkuch                5.34%
simple_logging          5.28%
formatted_logging       5.06%
"""

I'm surprised to see slow-downs, but I prefer to think that changes smaller 
than 5% are pure noise.

The good news is the long list of benchmarks with speedups larger than 5% :-) 
22% on unpickle_list is nice to have too!

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-22 Thread Catalin Gabriel Manciu

Catalin Gabriel Manciu added the comment:

Hi all,

Please find below the results of a complete GUPB run on a patched CPython 3.6. 
On average, an improvement of about 2.1% can be observed.

I'm also attaching an implementation of the patch for CPython 2.7 and its 
benchmark results. On GUPB the average performance boost is 1.5%. 
In addition, we are also seeing a 2.1% increase in throughput from our 
OpenStack Swift setup, as measured by ssbench.

Compared to my proposal in issue #26382, this patch yields slightly better 
results for CPython 3.6, gaining an average of +0.36% on GUPB, and similar 
results for CPython 2.7.


Hardware and OS configuration:
==============================
Hardware:           Intel XEON (Haswell-EP)

BIOS settings:      Intel Turbo Boost Technology: false
                    Hyper-Threading: false

OS:                 Ubuntu 14.04.2 LTS

OS configuration:   Address Space Layout Randomization (ASLR) disabled to
                    reduce run-to-run variation, via
                    echo 0 > /proc/sys/kernel/randomize_va_space
                    CPU frequency fixed at 2.3 GHz

Repository info:
CPython 2: 2d8e8d0e7162 (2.7)
CPython 3: f9391e2b74a5 (tip)

Results
=======

Table 1: CPython 3 GUPB results
-------------------------------
unpickle_list          22.74%
mako_v2                 9.13%
nqueens                 6.32%
meteor_contest          5.61%
fannkuch                5.34%
simple_logging          5.28%
formatted_logging       5.06%
fastunpickle            4.37%
json_dump_v2            3.10%
regex_compile           3.01%
raytrace                2.95%
pathlib                 2.43%
tornado_http            2.22%
django_v3               1.94%
telco                   1.65%
pickle_list             1.59%
chaos                   1.50%
etree_process           1.48%
fastpickle              1.34%
silent_logging          1.12%
2to3                    1.09%
float                   1.01%
nbody                   0.89%
normal_startup          0.86%
startup_nosite          0.79%
richards                0.67%
regex_v8                0.61%
etree_generate          0.57%
hexiom2                 0.54%
pickle_dict             0.20%
call_simple             0.18%
spectral_norm           0.17%
regex_effbot            0.16%
unpack_sequence         0.00%
call_method_unknown    -0.04%
chameleon_v2           -0.07%
json_load              -0.08%
etree_parse            -0.09%
pidigits               -0.15%
go                     -0.16%
etree_iterparse        -0.22%
call_method_slots      -0.49%
call_method            -0.97%


Table 2: CPython 2 GUPB results
-------------------------------
unpickle_list          16.88%
json_load              11.74%
fannkuch                8.11%
mako_v2                 6.91%
meteor_contest          6.27%
slowpickle              4.81%
nqueens                 4.46%
html5lib_warmup         3.53%
chaos                   2.67%
regex_v8                2.56%
html5lib                2.34%
fastunpickle            2.32%
tornado_http            2.23%
rietveld                2.15%
simple_logging          1.82%
normal_startup          1.57%
call_method_slots       1.53%
telco                   1.49%
regex_compile           1.47%
spectral_norm           1.36%
hg_startup              1.27%
regex_effbot            1.18%
nbody                   1.02%
2to3                    1.01%
pybench                 0.99%
chameleon_v2            0.98%
slowunpickle            0.93%
startup_nosite          0.92%
pickle_list             0.89%
richards                0.56%
django_v3               0.48%
json_dump_v2            0.41%
raytrace                0.38%
unpack_sequence         0.00%
float                  -0.05%
slowspitfire           -0.07%
go                     -0.24%
hexiom2                -0.26%
spambayes              -0.27%
pickle_dict            -0.30%
etree_parse            -0.32%
pidigits               -0.41%
etree_iterparse        -0.47%
bzr_startup            -0.55%
fastpickle             -0.74%
etree_process          -0.96%
formatted_logging      -1.01%
call_simple            -1.08%
pathlib                -1.12%
silent_logging         -1.22%
etree_generate         -1.23%
call_method_unknown    -2.14%
call_method            -2.22%

Table 3: OpenStack Swift ssbench results
----------------------------------------
ssbench                 2.11%

--
nosy: +catalin.manciu
Added file: http://bugs.python.org/file42004/pymem_27.patch

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-14 Thread Alecsandru Patrascu

Changes by Alecsandru Patrascu :


--
nosy: +alecsandru.patrascu

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

(Crap. I sent an incomplete message, sorry about that.)

> Hum, it looks like jemalloc uses *more* memory than libc memory allocators. I 
> don't know if it's a known 

I don't know if it's a known issue/property of jemalloc.

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

Yury: "Please use -r flag for perf.py"

Oh, I didn't know this flag. Sure, I can do that.

Here is a new benchmark using --rigorous to measure the performance of the 
attached pymem.patch.

It always seems faster, never slower.
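
For reference, a sketch of the invocation, reusing the binary names from the 
earlier non-rigorous run (the exact paths are an assumption here):
---
python3 -u perf.py --rigorous ../default/python.orig ../default/python.pymem
---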

Report on Linux smithers 4.3.3-300.fc23.x86_64 #1 SMP Tue Jan 5 23:31:01 UTC 
2016 x86_64 x86_64
Total CPU cores: 8

### 2to3 ###
Min: 6.772531 -> 6.686245: 1.01x faster
Avg: 6.875264 -> 6.726859: 1.02x faster
Significant (t=3.44)
Stddev: 0.09026 -> 0.03398: 2.6560x smaller

### django_v3 ###
Min: 0.562797 -> 0.552539: 1.02x faster
Avg: 0.591345 -> 0.557561: 1.06x faster
Significant (t=4.17)
Stddev: 0.07689 -> 0.02581: 2.9794x smaller

### fastpickle ###
Min: 0.464270 -> 0.437667: 1.06x faster
Avg: 0.467195 -> 0.442298: 1.06x faster
Significant (t=10.59)
Stddev: 0.01156 -> 0.02046: 1.7693x larger

### fastunpickle ###
Min: 0.548834 -> 0.526554: 1.04x faster
Avg: 0.554601 -> 0.539456: 1.03x faster
Significant (t=4.67)
Stddev: 0.01137 -> 0.03040: 2.6734x larger

### json_dump_v2 ###
Min: 2.723152 -> 2.603108: 1.05x faster
Avg: 2.749255 -> 2.693655: 1.02x faster
Significant (t=2.89)
Stddev: 0.03016 -> 0.18988: 6.2963x larger

### regex_v8 ###
Min: 0.044256 -> 0.042201: 1.05x faster
Avg: 0.044733 -> 0.043134: 1.04x faster
Significant (t=4.55)
Stddev: 0.00201 -> 0.00288: 1.4309x larger

### tornado_http ###
Min: 0.253405 -> 0.247401: 1.02x faster
Avg: 0.256274 -> 0.250380: 1.02x faster
Significant (t=17.48)
Stddev: 0.00285 -> 0.00382: 1.3430x larger

The following not significant results are hidden, use -v to show them:
chameleon_v2, json_load, nbody.

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

> Test with jemalloc using the shell script "python.jemalloc":
> ---
> #!/bin/sh
> LD_PRELOAD=/usr/lib64/libjemalloc.so /home/haypo/prog/python/default/python "$@"
> ---

"perf.py -m" doesn't work with such bash script, but it works using exec:
---
#!/bin/sh
LD_PRELOAD=/usr/lib64/libjemalloc.so exec 
/home/haypo/prog/python/default/python "$@"
---

> Memory consumption:
python3 -u perf.py -m ../default/python ../default/python.jemalloc


Hum, it looks like jemalloc uses *more* memory than libc memory allocators. I 
don't know if it's a known 


Report on Linux smithers 4.3.3-300.fc23.x86_64 #1 SMP Tue Jan 5 23:31:01 UTC 
2016 x86_64 x86_64
Total CPU cores: 8

### 2to3 ###
Mem max: 43088.000 -> 43776.000: 1.0160x larger

### chameleon_v2 ###
Mem max: 367028.000 -> 626324.000: 1.7065x larger

### django_v3 ###
Mem max: 23824.000 -> 25120.000: 1.0544x larger

### fastpickle ###
Mem max: 8696.000 -> 9712.000: 1.1168x larger

### fastunpickle ###
Mem max: 8708.000 -> 9696.000: 1.1135x larger

### json_dump_v2 ###
Mem max: 10488.000 -> 11556.000: 1.1018x larger

### json_load ###
Mem max: 8444.000 -> 9396.000: 1.1127x larger

### nbody ###
Mem max: 7392.000 -> 8416.000: 1.1385x larger

### regex_v8 ###
Mem max: 12760.000 -> 13576.000: 1.0639x larger

### tornado_http ###
Mem max: 28196.000 -> 29920.000: 1.0611x larger

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

> Why have two sets of functions doing exactly the same thing?

I have no idea.

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

Test with jemalloc using the shell script "python.jemalloc":
---
#!/bin/sh
LD_PRELOAD=/usr/lib64/libjemalloc.so /home/haypo/prog/python/default/python "$@"
---

Memory consumption:
python3 -u perf.py -m ../default/python ../default/python.jemalloc

Report on Linux smithers 4.3.3-300.fc23.x86_64 #1 SMP Tue Jan 5 23:31:01 UTC 
2016 x86_64 x86_64
Total CPU cores: 8

### 2to3 ###
Mem max: 43100.000 -> 220.000: 195.9091x smaller

### chameleon_v2 ###
Mem max: 367276.000 -> 224.000: 1639.6250x smaller

### django_v3 ###
Mem max: 24136.000 -> 284.000: 84.9859x smaller

### fastpickle ###
Mem max: 8692.000 -> 284.000: 30.6056x smaller

### fastunpickle ###
Mem max: 8704.000 -> 216.000: 40.2963x smaller

### json_dump_v2 ###
Mem max: 10448.000 -> 216.000: 48.3704x smaller

### json_load ###
Mem max: 8444.000 -> 220.000: 38.3818x smaller

### nbody ###
Mem max: 7388.000 -> 220.000: 33.5818x smaller

### regex_v8 ###
Mem max: 12764.000 -> 220.000: 58.0182x smaller

### tornado_http ###
Mem max: 28216.000 -> 228.000: 123.7544x smaller





Performance:

Report on Linux smithers 4.3.3-300.fc23.x86_64 #1 SMP Tue Jan 5 23:31:01 UTC 
2016 x86_64 x86_64
Total CPU cores: 8

### 2to3 ###
7.413484 -> 7.189792: 1.03x faster

### chameleon_v2 ###
Min: 5.559697 -> 5.869468: 1.06x slower
Avg: 5.672448 -> 6.033152: 1.06x slower
Significant (t=-13.67)
Stddev: 0.12098 -> 0.14203: 1.1740x larger

### nbody ###
Min: 0.242194 -> 0.229747: 1.05x faster
Avg: 0.244991 -> 0.235297: 1.04x faster
Significant (t=9.75)
Stddev: 0.00262 -> 0.00652: 2.4861x larger

### regex_v8 ###
Min: 0.042532 -> 0.046920: 1.10x slower
Avg: 0.043249 -> 0.047907: 1.11x slower
Significant (t=-13.23)
Stddev: 0.00180 -> 0.00172: 1.0503x smaller

### tornado_http ###
Min: 0.265755 -> 0.274526: 1.03x slower
Avg: 0.273617 -> 0.284186: 1.04x slower
Significant (t=-6.67)
Stddev: 0.00583 -> 0.01474: 2.5297x larger

The following not significant results are hidden, use -v to show them:
django_v3, fastpickle, fastunpickle, json_dump_v2, json_load.

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

FYI, here are benchmark results comparing Python with and without pymalloc (the 
fast memory allocator for blocks <= 512 bytes). As expected, the build without 
pymalloc is slower, up to 30% slower (and it's never faster).
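
For context, a pymalloc-free build like the one measured here can be produced 
with the standard configure switch (a sketch; other build options are assumed 
to stay at their defaults):
---
./configure --without-pymalloc
make
---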

Report on Linux smithers 4.3.3-300.fc23.x86_64 #1 SMP Tue Jan 5 23:31:01 UTC 
2016 x86_64 x86_64
Total CPU cores: 8

### 2to3 ###
7.253671 -> 7.558993: 1.04x slower

### chameleon_v2 ###
Min: 5.598481 -> 5.794526: 1.04x slower
Avg: 5.714233 -> 5.922142: 1.04x slower
Significant (t=-8.01)
Stddev: 0.15956 -> 0.09048: 1.7636x smaller

### django_v3 ###
Min: 0.574221 -> 0.606462: 1.06x slower
Avg: 0.579659 -> 0.612088: 1.06x slower
Significant (t=-28.44)
Stddev: 0.00605 -> 0.00532: 1.1371x smaller

### fastpickle ###
Min: 0.450852 -> 0.502645: 1.11x slower
Avg: 0.455619 -> 0.513777: 1.13x slower
Significant (t=-26.24)
Stddev: 0.00696 -> 0.01404: 2.0189x larger

### fastunpickle ###
Min: 0.544064 -> 0.696306: 1.28x slower
Avg: 0.552459 -> 0.705372: 1.28x slower
Significant (t=-85.52)
Stddev: 0.00798 -> 0.00980: 1.2281x larger

### json_dump_v2 ###
Min: 2.780312 -> 3.265531: 1.17x slower
Avg: 2.830463 -> 3.370060: 1.19x slower
Significant (t=-23.73)
Stddev: 0.04190 -> 0.15521: 3.7046x larger

### json_load ###
Min: 0.428893 -> 0.558956: 1.30x slower
Avg: 0.431941 -> 0.569441: 1.32x slower
Significant (t=-74.76)
Stddev: 0.00791 -> 0.01033: 1.3060x larger

### regex_v8 ###
Min: 0.043439 -> 0.044614: 1.03x slower
Avg: 0.044388 -> 0.046487: 1.05x slower
Significant (t=-4.95)
Stddev: 0.00215 -> 0.00209: 1.0283x smaller

### tornado_http ###
Min: 0.264603 -> 0.278840: 1.05x slower
Avg: 0.270153 -> 0.285263: 1.06x slower
Significant (t=-23.04)
Stddev: 0.00489 -> 0.00436: 1.1216x smaller

The following not significant results are hidden, use -v to show them:
nbody.

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

>> It looks like some benchmarks are up to 4% faster:

> What this says is that some internal uses of PyMem_XXX should be replaced 
> with PyObject_XXX.

Why not change PyMem_XXX to use the same fast allocator as PyObject_XXX? 
(as proposed in this issue)

FYI we now also have the PyMem_RawXXX family :)

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread Antoine Pitrou

Antoine Pitrou added the comment:

On 02/02/2016 15:48, STINNER Victor wrote:
>> What this says is that some internal uses of PyMem_XXX should be replaced 
>> with PyObject_XXX.
> 
> Why not change PyMem_XXX to use the same fast allocator as
> PyObject_XXX? (as proposed in this issue)

Why have two sets of functions doing exactly the same thing?

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread Antoine Pitrou

Antoine Pitrou added the comment:

On 02/02/2016 15:47, STINNER Victor wrote:
> 
> ### 2to3 ###
> Mem max: 43100.000 -> 220.000: 195.9091x smaller
> 
> ### chameleon_v2 ###
> Mem max: 367276.000 -> 224.000: 1639.6250x smaller
> 
> ### django_v3 ###
> Mem max: 24136.000 -> 284.000: 84.9859x smaller

These figures are not even remotely believable.
It would make sense to investigate them before posting such numbers ;-)

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

> These figures are not even remotely believable.

To be honest, I didn't try to understand them :-) Are they the RSS memory, 
in kB?

Maybe perf.py doesn't like my shell script?

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Hum, the point of PyMem_Malloc() is that it's distinct from PyObject_Malloc(), 
right? Why would you redirect one to the other?

--
nosy: +pitrou

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

About heap memory fragmentation, see also my two attached "benchmarks" in 
Python and C: python_memleak.py and tu_malloc.c.

--
Added file: http://bugs.python.org/file41778/python_memleak.py

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

> Hum, the point of PyMem_Malloc() is that it's distinct from 
> PyObject_Malloc(), right? Why would you redirect one to the other?

For performance.

> (of course, we might question why we have two different families of 
> allocation APIs...)

That's the real question: why does Python have the PyMem family at all? Is it 
still justified in 2016?

--

Firefox uses jemalloc to limit heap memory fragmentation. I once spent a lot of 
time trying to understand how fragmentation works, and in my tiny benchmarks 
jemalloc was *much* better than the system allocator. By the way, jemalloc also 
scales well across multiple threads ;-)

* http://www.canonware.com/jemalloc/
* https://github.com/jemalloc/jemalloc/wiki

My notes on heap memory fragmentation: 
http://haypo-notes.readthedocs.org/heap_fragmentation.html

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

STINNER Victor added the comment:

So, I ran ssh://h...@hg.python.org/benchmarks with my patch. It looks like some 
benchmarks are up to 4% faster:

$ python3 -u perf.py ../default/python.orig ../default/python.pymem

INFO:root:Automatically selected timer: perf_counter
[ 1/10] 2to3...
INFO:root:Running `../default/python.pymem lib3/2to3/2to3 -f all lib/2to3`
INFO:root:Running `../default/python.pymem lib3/2to3/2to3 -f all lib/2to3` 1 
time
INFO:root:Running `../default/python.orig lib3/2to3/2to3 -f all lib/2to3`
INFO:root:Running `../default/python.orig lib3/2to3/2to3 -f all lib/2to3` 1 time
[ 2/10] chameleon_v2...
INFO:root:Running `../default/python.pymem performance/bm_chameleon_v2.py -n 50 
--timer perf_counter`
INFO:root:Running `../default/python.orig performance/bm_chameleon_v2.py -n 50 
--timer perf_counter`
[ 3/10] django_v3...
INFO:root:Running `../default/python.pymem performance/bm_django_v3.py -n 50 
--timer perf_counter`
INFO:root:Running `../default/python.orig performance/bm_django_v3.py -n 50 
--timer perf_counter`
[ 4/10] fastpickle...
INFO:root:Running `../default/python.pymem performance/bm_pickle.py -n 50 
--timer perf_counter --use_cpickle pickle`
INFO:root:Running `../default/python.orig performance/bm_pickle.py -n 50 
--timer perf_counter --use_cpickle pickle`
[ 5/10] fastunpickle...
INFO:root:Running `../default/python.pymem performance/bm_pickle.py -n 50 
--timer perf_counter --use_cpickle unpickle`
INFO:root:Running `../default/python.orig performance/bm_pickle.py -n 50 
--timer perf_counter --use_cpickle unpickle`
[ 6/10] json_dump_v2...
INFO:root:Running `../default/python.pymem performance/bm_json_v2.py -n 50 
--timer perf_counter`
INFO:root:Running `../default/python.orig performance/bm_json_v2.py -n 50 
--timer perf_counter`
[ 7/10] json_load...
INFO:root:Running `../default/python.pymem performance/bm_json.py -n 50 --timer 
perf_counter json_load`
INFO:root:Running `../default/python.orig performance/bm_json.py -n 50 --timer 
perf_counter json_load`
[ 8/10] nbody...
INFO:root:Running `../default/python.pymem performance/bm_nbody.py -n 50 
--timer perf_counter`
INFO:root:Running `../default/python.orig performance/bm_nbody.py -n 50 --timer 
perf_counter`
[ 9/10] regex_v8...
INFO:root:Running `../default/python.pymem performance/bm_regex_v8.py -n 50 
--timer perf_counter`
INFO:root:Running `../default/python.orig performance/bm_regex_v8.py -n 50 
--timer perf_counter`
[10/10] tornado_http...
INFO:root:Running `../default/python.pymem performance/bm_tornado_http.py -n 
100 --timer perf_counter`
INFO:root:Running `../default/python.orig performance/bm_tornado_http.py -n 100 
--timer perf_counter`

Report on Linux smithers 4.3.3-300.fc23.x86_64 #1 SMP Tue Jan 5 23:31:01 UTC 
2016 x86_64 x86_64
Total CPU cores: 8

### 2to3 ###
6.880090 -> 6.818911: 1.01x faster

### fastpickle ###
Min: 0.453826 -> 0.442081: 1.03x faster
Avg: 0.456499 -> 0.443978: 1.03x faster
Significant (t=20.03)
Stddev: 0.00370 -> 0.00242: 1.5293x smaller

### fastunpickle ###
Min: 0.547908 -> 0.526027: 1.04x faster
Avg: 0.554663 -> 0.528686: 1.05x faster
Significant (t=15.95)
Stddev: 0.00893 -> 0.00728: 1.2260x smaller

### json_dump_v2 ###
Min: 2.733907 -> 2.627718: 1.04x faster
Avg: 2.762473 -> 2.664675: 1.04x faster
Significant (t=11.99)
Stddev: 0.03796 -> 0.04341: 1.1435x larger

### regex_v8 ###
Min: 0.042438 -> 0.042581: 1.00x slower
Avg: 0.042805 -> 0.044078: 1.03x slower
Significant (t=-2.12)
Stddev: 0.00171 -> 0.00388: 2.2694x larger

### tornado_http ###
Min: 0.254089 -> 0.246088: 1.03x faster
Avg: 0.257046 -> 0.249033: 1.03x faster
Significant (t=15.83)
Stddev: 0.00401 -> 0.00310: 1.2930x smaller

The following not significant results are hidden, use -v to show them:
chameleon_v2, django_v3, json_load, nbody.

real    19m13.413s
user    18m50.024s
sys      0m22.507s

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> It looks like some benchmarks are up to 4% faster:

What this says is that some internal uses of PyMem_XXX should be replaced with 
PyObject_XXX.

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread Antoine Pitrou

Antoine Pitrou added the comment:

(of course, we might question why we have two different families of allocation 
APIs...)

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread STINNER Victor

Changes by STINNER Victor :


Added file: http://bugs.python.org/file41779/tu_malloc.c

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-02-02 Thread Yury Selivanov

Yury Selivanov added the comment:

> On Feb 2, 2016, at 7:00 AM, STINNER Victor  wrote:
> 
> So, I ran ssh://h...@hg.python.org/benchmarks with my patch. It looks like 
> some benchmarks are up to 4% faster:

Please use -r flag for perf.py

--
nosy: +Yury.Selivanov

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-01-31 Thread STINNER Victor

New submission from STINNER Victor:

Issue #23601 showed a speedup for the dict type by replacing PyMem_Malloc() 
with PyObject_Malloc() in dictobject.c.

When I worked on PEP 445, using the Python fast memory allocator for small 
memory allocations (<= 512 bytes) was discussed, but I don't think anybody ran 
benchmarks on it.

So I'm opening this issue to discuss it.

By the way, we should also benchmark the Windows memory allocator, which limits 
fragmentation. Maybe we can skip the Python small-object allocator on recent 
versions of Windows?

The attached patch implements the change. The main question is the speedup on 
various kinds of memory allocations (we need benchmarks) :-)

I will try to run benchmarks.
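
A minimal sketch of the idea (not the attached pymem.patch itself, which is not 
inlined here and presumably changes the defaults in Objects/obmalloc.c 
directly): with the PEP 445 allocator API, the PyMem domain can be pointed at 
the same allocator that already backs the PyObject domain:
---
#include <Python.h>

/* Sketch only: install the PyObject-domain allocator (pymalloc for blocks
 * <= 512 bytes) for the PyMem domain as well.  This must run before any
 * PyMem_Malloc() block is allocated, since a block has to be freed by the
 * same allocator that allocated it. */
static void
redirect_pymem_to_pymalloc(void)
{
    PyMemAllocatorEx obj_alloc;

    PyMem_GetAllocator(PYMEM_DOMAIN_OBJ, &obj_alloc);
    PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &obj_alloc);
}
---
The sketch only illustrates the intended allocator routing; whether the actual 
patch takes this route is not visible from the thread.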

--

If the patch slows down Python, maybe we can investigate whether some Python 
types (like dict) mostly use "small" memory blocks (<= 512 bytes).

--
files: pymem.patch
keywords: patch
messages: 259290
nosy: haypo, rhettinger, serhiy.storchaka, yselivanov
priority: normal
severity: normal
status: open
title: Change PyMem_Malloc to use PyObject_Malloc allocator?
type: performance
versions: Python 3.6
Added file: http://bugs.python.org/file41767/pymem.patch

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-01-31 Thread STINNER Victor

Changes by STINNER Victor :


--
nosy: +jtaylor

[issue26249] Change PyMem_Malloc to use PyObject_Malloc allocator?

2016-01-31 Thread STINNER Victor

STINNER Victor added the comment:

Ok, to avoid confusion, I opened an issue specific to Windows for its 
"Low-fragmentation Heap": issue #26251.

Other issues related to memory allocators:

Merged:

- issue #21233: Add *Calloc functions to CPython memory allocation API 
(extension of the PEP 445, asked by numpy)
- issue #13483: Use VirtualAlloc to allocate memory arenas (implementation of 
the PEP 445)
- issue #3329: API for setting the memory allocator used by Python

Open:

- issue #18835: Add aligned memory variants to the suite of PyMem 
functions/macros => this one is still open, the status is unclear :-/
