[issue21233] Add *Calloc functions to CPython memory allocation API

2014-06-02 Thread STINNER Victor

STINNER Victor added the comment:

I reread the issue. I hope that I have now addressed all issues. The remaining 
issue, bytearray(int), is now tracked by the new issue #21644.

--
resolution:  -> fixed
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21233] Add *Calloc functions to CPython memory allocation API

2014-06-02 Thread STINNER Victor

STINNER Victor added the comment:

"2) I'm not happy with the refactoring in bytearray_init(). (...)

3) Somewhat similarly, I wonder if it was necessary to refactor
   PyBytes_FromStringAndSize(). (...)"

Ok, I reverted the change on bytearray(int) and opened the issue #21644 to 
discuss these two optimizations.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-06-02 Thread Roundup Robot

Roundup Robot added the comment:

New changeset dff6b4b61cac by Victor Stinner in branch 'default':
Issue #21233: Revert bytearray(int) optimization using calloc()
http://hg.python.org/cpython/rev/dff6b4b61cac

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-06-02 Thread STINNER Victor

STINNER Victor added the comment:

"Okay, then let's please call it:
_PyObject_Calloc(void *ctx, size_t nobjs, size_t objsize)
_PyObject_Alloc(int use_calloc, void *ctx, size_t nobjs, size_t objsize)"

"void * PyMem_RawCalloc(size_t nelem, size_t elsize);" prototype comes from the 
POSIX standard:
http://pubs.opengroup.org/onlinepubs/009695399/functions/calloc.html

I don't want to change the prototype in Python. Extract from the Python 
documentation:

.. c:function:: void* PyMem_RawCalloc(size_t nelem, size_t elsize)

   Allocates *nelem* elements each whose size in bytes is *elsize* (...)

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-06-02 Thread STINNER Victor

STINNER Victor added the comment:

> I'm not sure: The usual case with ABI changes is that extensions may segfault 
> if they are *not* recompiled [1].

Ok, I renamed the structure PyMemAllocator to PyMemAllocatorEx, so compiling 
code that still uses the old name now fails because PyMemAllocator is no longer 
defined. Modules compiled for Python 3.4 will crash on Python 3.5 if they are 
not recompiled, but I hope that you recompile your modules when you don't use 
the stable ABI.

Using PyMemAllocator is now more complex because it depends on the Python 
version. See for example the patch for pyfailmalloc:
https://bitbucket.org/haypo/pyfailmalloc/commits/9db92f423ac5f060d6ff499ee4bb74ebc0cf4761

Using the C preprocessor, it's possible to limit the changes.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-06-02 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 6374c2d957a9 by Victor Stinner in branch 'default':
Issue #21233: Rename the C structure "PyMemAllocator" to "PyMemAllocatorEx" to
http://hg.python.org/cpython/rev/6374c2d957a9

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-06 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 358a12f4d4bc by Victor Stinner in branch 'default':
Issue #21233: Fix _PyObject_Alloc() when compiled with WITH_VALGRIND defined
http://hg.python.org/cpython/rev/358a12f4d4bc

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-04 Thread Stefan Krah

Stefan Krah added the comment:

STINNER Victor  wrote:
> My final commit includes an addition to What's New in Python 3.5 doc,
> including a notice in the porting section. Is it not enough?

I'm not sure: The usual case with ABI changes is that extensions may segfault
if they are *not* recompiled [1].  In that case documenting it in What's New is
standard procedure.

Here the extension *is* recompiled and still segfaults.

> Even if the API is public, the PyMemAllocator thing is low level. It's not
> part of the stable ABI. Except failmalloc, I don't know any user. I don't
> expect a lot of complaints and it's easy to port the code.

Perhaps it's worth asking on python-dev. Nathaniel's suggestion isn't bad
either (e.g. name it PyMemAllocatorEx).

[1] I was told on python-dev that many people in fact do not recompile.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-03 Thread Stefan Krah

Stefan Krah added the comment:

STINNER Victor  wrote:
> PyObject_Malloc(100) asks to allocate one object of 100 bytes.

Okay, then let's please call it:

_PyObject_Calloc(void *ctx, size_t nobjs, size_t objsize)

_PyObject_Alloc(int use_calloc, void *ctx, size_t nobjs, size_t objsize)

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-03 Thread STINNER Victor

STINNER Victor added the comment:

>"allocate nbytes elements of size 1"

PyObject_Malloc(100) asks to allocate one object of 100 bytes.

For PyMem_Malloc() and PyMem_RawMalloc(), it's more difficult to guess, but
IMO it's sane to bet that a single memory block of size bytes is requested.

I consider that char data[100] is an object of 100 bytes, but you call it
100 objects of 1 byte.

I don't think that using nelem or elsize matters in practice.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-03 Thread Stefan Krah

Stefan Krah added the comment:

> > 5) If WITH_VALGRIND is defined, nbytes is uninitialized in
> _PyObject_Alloc().
> 
> Did you see my second commit? Isn't it already fixed?

I don't think so, I have revision 5d076506b3f5 here.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-03 Thread STINNER Victor

STINNER Victor added the comment:

> 5) If WITH_VALGRIND is defined, nbytes is uninitialized in
_PyObject_Alloc().

Did you see my second commit? Isn't it already fixed?

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-03 Thread STINNER Victor

STINNER Victor added the comment:

> 6) We need some kind of prominent documentation that existing
>programs need to be changed:

My final commit includes an addition to What's New in Python 3.5 doc,
including a notice in the porting section. Is it not enough?

Even if the API is public, the PyMemAllocator thing is low level. It's not
part of the stable ABI. Except failmalloc, I don't know any user. I don't
expect a lot of complaints and it's easy to port the code.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-03 Thread Nathaniel Smith

Nathaniel Smith added the comment:

A simple solution would be to change the name of the struct, so that 
non-updated libraries will get a compile error instead of a runtime crash.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-03 Thread Stefan Krah

Stefan Krah added the comment:

Another thing:

6) We need some kind of prominent documentation that existing
   programs need to be changed:

Python 3.5.0a0 (default:62438d1b11c7+, May  3 2014, 23:35:03) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import failmalloc
>>> failmalloc.enable()
>>> bytes(1)
Segmentation fault (core dumped)

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-03 Thread Stefan Krah

Stefan Krah added the comment:

I forgot one thing:

5) If WITH_VALGRIND is defined, nbytes is uninitialized in _PyObject_Alloc().

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-03 Thread Stefan Krah

Stefan Krah added the comment:

I did a post-commit review.  A couple of things:


1) I think Victor and I have a different view of the calloc() parameters.

  calloc(size_t nmemb, size_t size)

   If a memory region of bytes is allocated, IMO 'nbytes' should be in the
   place of 'nmemb' and '1' should be in the place of 'size'. That is,
   "allocate nbytes elements of size 1":

  calloc(nbytes, 1)


   In the commit the parameters are reversed in many places, which confuses
   me quite a bit, since it means "allocate one element of size nbytes".

  calloc(1, nbytes)


2) I'm not happy with the refactoring in bytearray_init(). I think it would
   be safer to make focused minimal changes in PyByteArray_Resize() instead.
   In fact, there is a behavior change which isn't correct:

Before:
=======
>>> x = bytearray(0)
>>> m = memoryview(x)
>>> x.__init__(10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BufferError: Existing exports of data: object cannot be re-sized

Now:
====
>>> x = bytearray(0)
>>> m = memoryview(x)
>>> x.__init__(10)
>>> x[0]
0
>>> m[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index out of bounds

3) Somewhat similarly, I wonder if it was necessary to refactor
   PyBytes_FromStringAndSize(). I find the new version more difficult
   to understand.


4) _PyObject_Alloc(): assert(nelem <= PY_SSIZE_T_MAX / elsize) can be called
   with elsize = 0.
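
The invariant behind point 2 can be checked from Python: while a memoryview
export is alive, an operation that would resize the bytearray should raise
BufferError. The following is only a minimal sketch. It uses extend() as the
resizing operation instead of __init__() (an assumption, chosen so that the
result does not depend on which revision is being tested), and
resize_is_blocked_by_export is a hypothetical helper name.

```python
def resize_is_blocked_by_export():
    """Return True if resizing a bytearray with a live export raises BufferError."""
    x = bytearray(4)
    m = memoryview(x)          # export the buffer; x must not be resized now
    try:
        x.extend(b"more")      # would need to reallocate the exported buffer
    except BufferError:
        blocked = True
    else:
        blocked = False
    finally:
        m.release()            # drop the export
    x.extend(b"more")          # resizing works again once the export is gone
    return blocked and len(x) == 8


print(resize_is_blocked_by_export())
```

A regression test for bytearray_init() could follow the same pattern with
x.__init__(10) in place of extend().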

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-02 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 62438d1b11c7 by Victor Stinner in branch 'default':
Issue #21233: Oops, Fix _PyObject_Alloc(): initialize nbytes before going to
http://hg.python.org/cpython/rev/62438d1b11c7

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-02 Thread STINNER Victor

STINNER Victor added the comment:

Antoine Pitrou wrote:
>  The real use case I envision is with huge powers of two. If I write:
> x = 2 ** 100

I created the issue #21419 for this idea.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-02 Thread STINNER Victor

STINNER Victor added the comment:

> There is no need to hurry.

I changed my mind :-p It should be easier for numpy to test the development 
version of Python.

Let's wait for buildbots.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-05-02 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 5b0fda8f5718 by Victor Stinner in branch 'default':
Issue #21233: Add new C functions: PyMem_RawCalloc(), PyMem_Calloc(),
http://hg.python.org/cpython/rev/5b0fda8f5718

--
nosy: +python-dev


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-30 Thread STINNER Victor

STINNER Victor added the comment:

>  If you prefer to commit very soon,
> I promise to do a post commit review.

There is no need to hurry.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-30 Thread Stefan Krah

Stefan Krah added the comment:

Victor, sure, maybe not right away.  If you prefer to commit very soon,
I promise to do a post commit review.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-30 Thread STINNER Victor

STINNER Victor added the comment:

@Stefan: Can you please review calloc-6.patch? Charles-François wrote that the 
patch looks good, but for such a critical operation (memory allocation), I would 
prefer a second review ;)

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-29 Thread Charles-François Natali

Charles-François Natali added the comment:

LGTM!

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-29 Thread STINNER Victor

STINNER Victor added the comment:

Patch version 6:

- I renamed the "int zero" parameter to "int use_calloc" and moved it to the 
first position to avoid confusion with nelem. For example, 
_PyObject_Alloc(ctx, 1, nbytes, 0) becomes _PyObject_Alloc(0, ctx, 1, nbytes). 
It is also more logical to put it in the first position. In bytesobject.c, I 
left the parameter at the end since its meaning is different (fill 
bytes with zero or not) IMO.

- I removed my hack (premature optimization) "assert(nelem == 1); ... 
malloc(elsize);" and replaced it with a less surprising "... malloc(nelem * 
elsize);"
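
Once the allocator computes nelem * elsize itself, the product has to be
checked against the maximum size, and that check must not divide by zero when
elsize is 0. The sketch below models only the arithmetic, in Python, where
integers never overflow; checked_alloc_size is a hypothetical helper and
PY_SSIZE_T_MAX here is a stand-in for the C constant, not the code from the
patch.

```python
import sys

# Stand-in for the C constant PY_SSIZE_T_MAX.
PY_SSIZE_T_MAX = sys.maxsize


def checked_alloc_size(nelem, elsize):
    """Return nelem * elsize, or None when the request must be refused.

    Hypothetical helper: it models the overflow test only, not the allocator.
    """
    if nelem < 0 or elsize < 0:
        return None
    # Guard the division: elsize == 0 is a valid request for zero bytes.
    if elsize != 0 and nelem > PY_SSIZE_T_MAX // elsize:
        return None
    return nelem * elsize


print(checked_alloc_size(1, 100))
print(checked_alloc_size(10, 0))
print(checked_alloc_size(PY_SSIZE_T_MAX, 2))
```

In C the same test would be written before the multiplication, so that the
product is never computed when it would wrap around.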

Stefan & Charles-François: I hope that the patch looks better to you.

--
Added file: http://bugs.python.org/file35097/calloc-6.patch


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> It would be interesting to see some NumPy benchmarks (Nathaniel?).

What is it you want to see? NumPy already uses calloc; we benchmarked it when 
we added it and it made a huge difference to various realistic workloads :-). 
What NumPy gets out of this isn't calloc, it's access to tracemalloc.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread Stefan Krah

Stefan Krah added the comment:

The order of the nelem/elsize matters for readability. Otherwise it is
not intuitive what happens after the jump to redirect in _PyObject_Alloc().

Why would you assert that 'nelem' is one?

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread STINNER Victor

STINNER Victor added the comment:

> Hmm, obmalloc.c changed as well, so already the gcc optimizer can take
> different paths and produce different results.

If decimal depends on allocator performance, you should maybe try to
implement a freelist.

> Also I did set mpd_callocfunc to PyMem_Calloc().

I don't understand. Is the 2% slowdown when you use calloc? Do you have the
same speed if you don't use calloc? According to my benchmarks, calloc is
slower if some bytes are modified later.

> The bytes() speedup is very nice. Allocations that took one second
> are practically instant now.

Is it really useful? Who needs a bytes(10**8) object?

Faster creation of bytearray(int) may be useful in real applications. I
really like bytearray and memoryview to avoid memory copies.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread Charles-François Natali

Charles-François Natali added the comment:

> Also I did set mpd_callocfunc to PyMem_Calloc(). 2% slowdown is far
> from being a tragic result, so I guess we can ignore that.

Agreed.

> The bytes() speedup is very nice. Allocations that took one second
> are practically instant now.

Indeed.
Victor, thanks for the great work!

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread Stefan Krah

Stefan Krah added the comment:

Hmm, obmalloc.c changed as well, so already the gcc optimizer can take
different paths and produce different results.

Also I did set mpd_callocfunc to PyMem_Calloc(). 2% slowdown is far
from being a tragic result, so I guess we can ignore that.

The bytes() speedup is very nice. Allocations that took one second
are practically instant now.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread STINNER Victor

STINNER Victor added the comment:

> With the latest patch the decimal benchmark with a lot of small
> allocations is consistently 2% slower.

Does your benchmark use bytes(int) or bytearray(int)? If not, I guess that your 
benchmark is not reliable because only these two functions are changed by 
calloc-5.patch, unless there is a bug in my patch.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread Stefan Krah

Stefan Krah added the comment:

With the latest patch the decimal benchmark with a lot of small
allocations is consistently 2% slower. Large factorials (where
the operands are initialized to zero for the number-theoretic
transform) have the same performance with and without the patch.

It would be interesting to see some NumPy benchmarks (Nathaniel?).

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread STINNER Victor

STINNER Victor added the comment:

Demo of calloc-5.patch on Linux. Thanks to calloc(), bytes(50 * 1024 * 1024) 
doesn't allocate memory for null bytes and so the RSS memory is unchanged (+148 
kB, not +50 MB), but tracemalloc says that 50 MB were allocated.

$ ./python -X tracemalloc
Python 3.5.0a0 (default:4b97092aa4bd+, Apr 28 2014, 10:40:53) 
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, tracemalloc
>>> os.system("grep RSS /proc/%s/status" % os.getpid())
VmRSS: 10736 kB
0
>>> before = tracemalloc.get_traced_memory()[0]
>>> large = bytes(50 * 1024 * 1024)
>>> import sys
>>> sys.getsizeof(large) / 1024.
51200.0478515625
>>> (tracemalloc.get_traced_memory()[0] - before) / 1024.
51198.1962890625
>>> os.system("grep RSS /proc/%s/status" % os.getpid())
VmRSS: 10884 kB
0
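
The tracemalloc half of this demo can be repeated as a script. This is only a
sketch: traced_growth is a hypothetical helper, and the RSS half is left out
because reading /proc is Linux-specific. It shows that tracemalloc accounts
for the whole requested block, whether or not the pages have been touched.

```python
import tracemalloc


def traced_growth(nbytes):
    """Return how many bytes tracemalloc attributes to creating bytes(nbytes)."""
    started_here = not tracemalloc.is_tracing()
    if started_here:
        tracemalloc.start()
    try:
        before = tracemalloc.get_traced_memory()[0]
        large = bytes(nbytes)              # zero-filled object
        after = tracemalloc.get_traced_memory()[0]
        del large
        return after - before
    finally:
        if started_here:
            tracemalloc.stop()


size = 50 * 1024 * 1024
growth = traced_growth(size)
print("requested %d kB, traced %d kB" % (size // 1024, growth // 1024))
```

The traced figure can differ slightly from the requested size, as in the
session above, because other small blocks are allocated and freed in between.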

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread STINNER Victor

STINNER Victor added the comment:

Patch version 5. This patch is ready for a review.

Summary of calloc-5.patch:

- add the following functions:

  * void* PyMem_RawCalloc(size_t nelem, size_t elsize)
  * void* PyMem_Calloc(size_t nelem, size_t elsize)
  * void* PyObject_Calloc(size_t nelem, size_t elsize)
  * PyObject* _PyObject_GC_Calloc(size_t basicsize)

- add "void* calloc(void *ctx, size_t nelem, size_t elsize)" field to the 
PyMemAllocator structure
- optimize bytes(n) and bytearray(n) to allocate objects using calloc() instead 
of malloc()
- update tracemalloc to also trace calloc()
- document new functions and add unit tests for the calloc "hook" (in _testcapi)


Changes between versions 4 and 5:

- revert all changes except bytes(n) and bytearray(n) of use_calloc.patch: they 
were useless according to benchmarks
- _PyObject_GC_Calloc() now takes a single parameter
- add versionadded and versionchanged fields in the documentation


According to benchmarks, calloc() is only useful for large allocations (1 MB?) 
if only a part of the memory block is modified (to non-zero bytes) just after 
the allocation. Untouched memory pages don't use physical memory and don't 
count towards RSS, but it is possible to read their content (null bytes). Using 
calloc() instead of malloc()+memset(0) doesn't appear to be faster (it may be a 
little bit slower) if all bytes are set just after the allocation.

I chose to only use one parameter for _PyObject_GC_Calloc() because this 
function is used to allocate Python objects. A structure of a Python object 
must start with PyObject_HEAD or PyObject_VAR_HEAD and so the total size of an 
object cannot be expressed as NELEM * ELEMSIZE.

I have no use case for _PyObject_GC_Calloc(), but it makes sense to use it to 
allocate a large Python object tracked by the GC and using a single memory 
block for the Python header + data.

PyObject_Calloc() simply uses memset(0) for small objects (<= 512 bytes). It 
delegates the allocation to PyMem_RawCalloc(), and so indirectly to calloc(), 
for larger objects.

Note: use_calloc.patch is no longer needed; I merged the two patches since only 
bytes(n) and bytearray(n) now use calloc().

--
Added file: http://bugs.python.org/file35072/calloc-5.patch


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread STINNER Victor

STINNER Victor added the comment:

Changes on the pickle module don't look like an interesting optimization. It 
even looks slower.

$ python perf.py -b fastpickle,fastunpickle,pickle,pickle_dict,pickle_list,slowpickle,slowunpickle,unpickle ../default/python.orig ../default/python.calloc
...

Report on Linux selma 3.13.9-200.fc20.x86_64 #1 SMP Fri Apr 4 12:13:05 UTC 2014 x86_64 x86_64
Total CPU cores: 4

### fastpickle ###
Min: 0.364510 -> 0.374144: 1.03x slower
Avg: 0.367882 -> 0.377714: 1.03x slower
Significant (t=-11.54)
Stddev: 0.00493 -> 0.00347: 1.4209x smaller

The following not significant results are hidden, use -v to show them:
fastunpickle, pickle_dict, pickle_list.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-28 Thread STINNER Victor

STINNER Victor added the comment:

It looks like Windows also supports lazy initialization of memory pages 
initialized to zero.

According to my microbenchmark on Linux and Windows, only bytes(n) and 
bytearray(n) are really faster with use_calloc.patch. Most changes of 
use_calloc.patch are probably useless since all bytes are initialized to zero, 
but just after that they are overwritten with new bytes.

Results of bench_alloc2.py on Windows 7: original vs 
calloc-4.patch+use_calloc.patch:

Common platform:
Timer: time.perf_counter
Python unicode implementation: PEP 393
Bits: int=32, long=32, long long=64, size_t=32, void*=32
Platform: Windows-7-6.1.7601-SP1
CFLAGS: None
Timer info: namespace(adjustable=False, implementation='QueryPerformanceCounter()', monotonic=True, resolution=1e-08)

Platform of campaign orig:
SCM: hg revision=4b97092aa4bd branch=default date="2014-04-27 18:02 +0100"
Date: 2014-04-28 09:35:30
Python version: 3.5.0a0 (default, Apr 28 2014, 09:33:30) [MSC v.1600 32 bit (Intel)]
Timer precision: 4.47 us

Platform of campaign calloc:
SCM: hg revision=4f0aaa8804c6 tag=tip branch=default date="2014-04-28 09:27 +0200"
Date: 2014-04-28 09:37:37
Python version: 3.5.0a0 (default:4f0aaa8804c6, Apr 28 2014, 09:37:03) [MSC v.1600 32 bit (Intel)]
Timer precision: 4.44 us

-----------------------+-------------+----------------
Tests                  |        orig |          calloc
-----------------------+-------------+----------------
object()               |  121 ns (*) |   109 ns (-10%)
b'A' * 10              |   77 ns (*) |           79 ns
b'A' * 10**3           |  159 ns (*) |    168 ns (+5%)
b'A' * 10**6           |  428 us (*) |          415 us
'A' * 10               |   87 ns (*) |           89 ns
'A' * 10**3            |  175 ns (*) |          177 ns
'A' * 10**6            |  429 us (*) |    454 us (+6%)
'A' * 10**8            | 48.4 ms (*) |           49 ms
(None,) * 10**0        |   49 ns (*) |           51 ns
(None,) * 10**1        |  115 ns (*) |    99 ns (-14%)
(None,) * 10**2        |  433 ns (*) |          422 ns
(None,) * 10**3        | 3.58 us (*) |         3.57 us
(None,) * 10**4        | 34.9 us (*) |         34.9 us
(None,) * 10**5        |  347 us (*) |          351 us
(None,) * 10**6        | 5.14 ms (*) |   4.85 ms (-6%)
(None,) * 10**7        | 53.2 ms (*) |   50.2 ms (-6%)
(None,) * 10**8        |  563 ms (*) |    515 ms (-9%)
([None] * 10)[1:-1]    |  217 ns (*) |          217 ns
([None] * 10**3)[1:-1] | 3.89 us (*) |         3.92 us
([None] * 10**6)[1:-1] | 5.13 ms (*) |         5.17 ms
([None] * 10**8)[1:-1] |  634 ms (*) |   533 ms (-16%)
bytes(10)              |  193 ns (*) |    206 ns (+7%)
bytes(10**3)           |  266 ns (*) |   296 ns (+12%)
bytes(10**6)           |  414 us (*) |  3.89 us (-99%)
bytes(10**8)           | 44.2 ms (*) | 4.56 us (-100%)
bytearray(10)          |  229 ns (*) |    243 ns (+6%)
bytearray(10**3)       |  301 ns (*) |   330 ns (+10%)
bytearray(10**6)       |  421 us (*) |  3.89 us (-99%)
bytearray(10**8)       | 44.4 ms (*) | 4.56 us (-100%)
-----------------------+-------------+----------------
Total                  | 1.4 sec (*) | 1.16 sec (-17%)
-----------------------+-------------+----------------
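
For readers who want to compare a few of these rows on their own build,
something like the following may be enough. It is not bench_alloc2.py, which
is only attached to the issue; it is a rough stand-in built on the standard
timeit module, and the bench() helper and the choice of statements are
assumptions made for illustration.

```python
import timeit


def bench(stmt, number=1000):
    """Best-of-3 time per call of stmt, in seconds."""
    return min(timeit.repeat(stmt, number=number, repeat=3)) / number


statements = ("bytes(10)", "bytes(10**3)", "bytes(10**6)", "bytearray(10**6)")
results = {stmt: bench(stmt) for stmt in statements}

for stmt, seconds in results.items():
    print("%-18s %12.0f ns" % (stmt, seconds * 1e9))
```

Running it once on an unpatched build and once on a patched build gives two
columns comparable to "orig" and "calloc", although the absolute numbers
depend on the machine and the C library.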

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

> The real use case I envision is with huge powers of two.

I'm not sure that it's a common use case, but it can be nice to optimize this 
case if it doesn't make longobject.c more complex. It looks like calloc() 
becomes interesting for objects larger than 1 MB.

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> Ok, now the real use case where it becomes faster: I implemented the
> same optimization for bytearray.

The real use case I envision is with huge powers of two. If I write:

  x = 2 ** 100

then all of x's bytes except the highest one will be zeros. If we map those to 
/dev/zero, it will be a massive saving for programs using huge powers of two.
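
The claim about the zero bytes is easy to check from Python. This is a small
sketch that looks at the little-endian byte encoding of the value; CPython
stores integers internally as arrays of 15-bit or 30-bit digits rather than
bytes, so the byte view is only an approximation of the in-memory layout, and
nonzero_byte_count is a hypothetical helper.

```python
def nonzero_byte_count(value):
    """Count the non-zero bytes in the little-endian encoding of a non-negative int."""
    length = max(1, (value.bit_length() + 7) // 8)
    return sum(1 for b in value.to_bytes(length, "little") if b)


x = 2 ** 100
raw = x.to_bytes(13, "little")        # 101 bits fit in 13 bytes
print(raw.hex())
print(nonzero_byte_count(2 ** 1000000))
```

Only the most significant byte is non-zero, so for a very large power of two
almost the whole buffer could stay on untouched zero pages.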

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

> Don't hesitate to rerun my benchmark on more different platforms?

Oops, I wanted to write ";-)" not "?".

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

> Are you sure this is a good platform for performance reports? :)

Don't hesitate to rerun my benchmark on more different platforms?

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> Common platform:
> Timer: time.perf_counter
> Timer info: namespace(adjustable=False, 
> implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, 
> resolution=1e-09)
> Platform: Linux-3.13.9-200.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
                                                               ^
Are you sure this is a good platform for performance reports? :)

--


[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

bench_alloc2.py: updated benchmark script. I added bytes(n) and bytearray(n) 
tests and removed the test decoding from ASCII.

Common platform:
Timer: time.perf_counter
Timer info: namespace(adjustable=False, 
implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, 
resolution=1e-09)
Platform: Linux-3.13.9-200.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
SCM: hg revision=4b97092aa4bd+ tag=tip branch=default date="2014-04-27 18:02 
+0100"
Python unicode implementation: PEP 393
Bits: int=32, long=64, long long=64, size_t=64, void*=64
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g 
-fwrapv -O3 -Wall -Wstrict-prototypes
CPU model: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz

Platform of campaign orig:
Date: 2014-04-28 01:11:49
Timer precision: 39 ns
Python version: 3.5.0a0 (default:4b97092aa4bd, Apr 28 2014, 01:02:01) [GCC 
4.8.2 20131212 (Red Hat 4.8.2-7)]

Platform of campaign calloc:
Date: 2014-04-28 01:12:29
Timer precision: 44 ns
Python version: 3.5.0a0 (default:4b97092aa4bd+, Apr 28 2014, 01:06:54) [GCC 
4.8.2 20131212 (Red Hat 4.8.2-7)]

-----------------------+-------------+----------------
Tests                  |        orig |          calloc
-----------------------+-------------+----------------
object()               |   62 ns (*) |    72 ns (+16%)
b'A' * 10              |   53 ns (*) |           52 ns
b'A' * 10**3           |   96 ns (*) |   110 ns (+15%)
b'A' * 10**6           | 38.5 us (*) |         38.6 us
'A' * 10               |   59 ns (*) |           61 ns
'A' * 10**3            |  105 ns (*) |          108 ns
'A' * 10**6            | 38.6 us (*) |         38.6 us
'A' * 10**8            | 10.3 ms (*) |         10.4 ms
(None,) * 10**0        |   29 ns (*) |           29 ns
(None,) * 10**1        |   75 ns (*) |           76 ns
(None,) * 10**2        |  432 ns (*) |    461 ns (+7%)
(None,) * 10**3        | 3.58 us (*) |          3.6 us
(None,) * 10**4        | 35.8 us (*) |         35.7 us
(None,) * 10**5        |  365 us (*) |          365 us
(None,) * 10**6        |  4.1 ms (*) |         4.13 ms
(None,) * 10**7        | 43.6 ms (*) |   40.3 ms (-8%)
(None,) * 10**8        |  433 ms (*) |    401 ms (-7%)
([None] * 10)[1:-1]    |  122 ns (*) |   134 ns (+10%)
([None] * 10**3)[1:-1] |  3.6 us (*) |         3.62 us
([None] * 10**6)[1:-1] | 4.22 ms (*) |          4.2 ms
([None] * 10**8)[1:-1] |  441 ms (*) |    402 ms (-9%)
bytes(10)              |  137 ns (*) |          136 ns
bytes(10**3)           |  181 ns (*) |    191 ns (+5%)
bytes(10**6)           | 38.7 us (*) |         39.2 us
bytes(10**8)           | 10.3 ms (*) | 4.36 us (-100%)
bytearray(10)          |  138 ns (*) |   153 ns (+11%)
bytearray(10**3)       |  184 ns (*) |   211 ns (+14%)
bytearray(10**6)       | 38.7 us (*) |         39.3 us
bytearray(10**8)       | 10.3 ms (*) | 4.32 us (-100%)
-----------------------+-------------+----------------
Total                  |  957 ms (*) |   862 ms (-10%)
-----------------------+-------------+----------------

--
Added file: http://bugs.python.org/file35065/bench_alloc2.py




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

I split my patch into two parts:

- calloc-4.patch: add new "Calloc" functions including _PyObject_GC_Calloc()
- use_calloc.patch: patch types (bytes, dict, list, set, tuple, etc.) and 
various modules to use calloc

I reverted my changes on _PyObject_GC_Malloc() and added _PyObject_GC_Calloc(); 
the performance regressions are gone. Creating a large tuple is a little bit (8%) 
faster. But the real speedup is in building a large bytes string of null bytes:


$ ./python.orig -m timeit 'bytes(50*1024*1024)'
100 loops, best of 3: 5.7 msec per loop
$ ./python.calloc -m timeit 'bytes(50*1024*1024)'
10 loops, best of 3: 4.12 usec per loop

On Linux, no memory is allocated, even if you read the bytes content. RSS is 
almost unchanged.

Ok, now the real use case where it becomes faster: I implemented the same 
optimization for bytearray.

$ ./python.orig -m timeit 'bytearray(50*1024*1024)'
100 loops, best of 3: 6.33 msec per loop
$ ./python.calloc -m timeit 'bytearray(50*1024*1024)'
10 loops, best of 3: 4.09 usec per loop

If you overallocate a bytearray and only write a few bytes, the bytes at the end 
of the bytearray will not be allocated (at least on Linux).
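
For anyone who wants to repeat the bytes/bytearray measurement from inside Python rather than from the shell, here is a minimal sketch built on the timeit module. The helper name time_zero_alloc and the repeat counts are only illustrative choices, and whether the timing stays flat as the size grows depends on the build really using calloc() for that type.

```python
import timeit


def time_zero_alloc(factory, size, repeat=3, number=5):
    """Return the best per-call time in seconds for factory(size).

    factory is a callable such as bytes or bytearray that builds a
    zero-filled object of the given size.
    """
    timer = timeit.Timer(lambda: factory(size))
    # Take the minimum over several runs, as "python -m timeit" does.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    size = 50 * 1024 * 1024  # same size as the timeit commands above
    for factory in (bytes, bytearray):
        seconds = time_zero_alloc(factory, size)
        print("%s(%d): %.2f usec" % (factory.__name__, size, seconds * 1e6))
```

Running the same script with ./python.orig and ./python.calloc should give numbers comparable to the shell commands.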


Result of bench_alloc.py comparing original Python to patched Python 
(calloc-4.patch + use_calloc.patch).

Common platform:
SCM: hg revision=4b97092aa4bd+ tag=tip branch=default date="2014-04-27 18:02 
+0100"
Timer info: namespace(adjustable=False, 
implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, 
resolution=1e-09)
Python unicode implementation: PEP 393
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g 
-fwrapv -O3 -Wall -Wstrict-prototypes
Bits: int=32, long=64, long long=64, size_t=64, void*=64
Timer: time.perf_counter
CPU model: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Platform: Linux-3.13.9-200.fc20.x86_64-x86_64-with-fedora-20-Heisenbug

Platform of campaign orig:
Timer precision: 42 ns
Date: 2014-04-28 00:27:19
Python version: 3.5.0a0 (default:4b97092aa4bd, Apr 28 2014, 00:24:03) [GCC 
4.8.2 20131212 (Red Hat 4.8.2-7)]

Platform of campaign calloc:
Timer precision: 54 ns
Date: 2014-04-28 00:28:35
Python version: 3.5.0a0 (default:4b97092aa4bd+, Apr 28 2014, 00:25:56) [GCC 
4.8.2 20131212 (Red Hat 4.8.2-7)]

-----------------------------------+-------------+--------------
Tests                              |        orig |        calloc
-----------------------------------+-------------+--------------
object()                           |   61 ns (*) |  71 ns (+16%)
b'A' * 10                          |   54 ns (*) |         52 ns
b'A' * 10**3                       |  124 ns (*) | 110 ns (-12%)
b'A' * 10**6                       | 38.4 us (*) |       38.5 us
'A' * 10                           |   59 ns (*) |         62 ns
'A' * 10**3                        |  132 ns (*) | 107 ns (-19%)
'A' * 10**6                        | 38.5 us (*) |       38.5 us
'A' * 10**8                        | 10.3 ms (*) |       10.6 ms
decode 10 null bytes from ASCII    |  264 ns (*) |        263 ns
decode 10**3 null bytes from ASCII |  403 ns (*) |  379 ns (-6%)
decode 10**6 null bytes from ASCII | 80.5 us (*) |       80.5 us
decode 10**8 null bytes from ASCII | 17.7 ms (*) |       17.3 ms
(None,) * 10**0                    |   29 ns (*) |         28 ns
(None,) * 10**1                    |   75 ns (*) |         76 ns
(None,) * 10**2                    |  461 ns (*) |        460 ns
(None,) * 10**3                    |  3.6 us (*) |       3.57 us
(None,) * 10**4                    | 35.7 us (*) |       35.7 us
(None,) * 10**5                    |  364 us (*) |        365 us
(None,) * 10**6                    | 4.12 ms (*) |       4.11 ms
(None,) * 10**7                    | 43.5 ms (*) | 40.3 ms (-7%)
(None,) * 10**8                    |  433 ms (*) |  400 ms (-8%)
([None] * 10)[1:-1]                |  121 ns (*) | 134 ns (+11%)
([None] * 10**3)[1:-1]             | 3.62 us (*) |       3.61 us
([None] * 10**6)[1:-1]             | 4.24 ms (*) |       4.22 ms
([None] * 10**8)[1:-1]             |  440 ms (*) |  402 ms (-9%)
-----------------------------------+-------------+--------------
Total                              |  954 ms (*) |  880 ms (-8%)
-----------------------------------+-------------+--------------

--
Added file: http://bugs.python.org/file35063/calloc-4.patch




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

Changes by STINNER Victor :


Added file: http://bugs.python.org/file35064/use_calloc.patch




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

>> Hm...
>> What's /proc/sys/vm/overcommit_memory ?
>> If it's set to 0, then the kernel will always overcommit.
>
> Ah, indeed.

See above, I mistyped: 0 is the default (which is already quite
optimistic), 1 is always.

>> If you set it to 2, normally you'd definitely get ENOMEM
>
> You're right, but with weird results:
>
> $ gcc -o /tmp/test test.c; /tmp/test
> malloc() returned NULL after 600MB
> $ gcc -DDO_MEMSET -o /tmp/test test.c; /tmp/test
> malloc() returned NULL after 600MB
>
> (I'm supposed to have gigabytes free?!)

The formula is RAM * vm.overcommit_ratio /100 + swap

So if you don't have swap, or a low overcommit_ratio, it could explain
why it returns so early.
Or maybe you have some processes with a lot of mapped-yet-unused
memory (chromium is one of those for example).

Anyway, it's really a mess!
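
To make the formula concrete, here is a small sketch of the arithmetic. The function name commit_limit_bytes is made up for illustration, and the sketch only follows the simplified formula quoted above; it deliberately ignores anything else the kernel may take into account.

```python
GIB = 2 ** 30


def commit_limit_bytes(ram_bytes, swap_bytes, overcommit_ratio=50):
    """Approximate commit limit: RAM * overcommit_ratio / 100 + swap.

    overcommit_ratio is a percentage, as in vm.overcommit_ratio.
    """
    return ram_bytes * overcommit_ratio // 100 + swap_bytes


if __name__ == "__main__":
    # Hypothetical machine: 8 GiB of RAM, no swap, default ratio of 50.
    limit = commit_limit_bytes(8 * GIB, 0)
    print("commit limit: %d MiB" % (limit // 2 ** 20))
```

With no swap and the default ratio, only half of the RAM can be committed, which is one way to hit ENOMEM long before the machine looks full.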

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> Hm...
> What's /proc/sys/vm/overcommit_memory ?
> If it's set to 0, then the kernel will always overcommit.

Ah, indeed.

> If you set it to 2, normally you'd definitely get ENOMEM

You're right, but with weird results:

$ gcc -o /tmp/test test.c; /tmp/test
malloc() returned NULL after 600MB
$ gcc -DDO_MEMSET -o /tmp/test test.c; /tmp/test
malloc() returned NULL after 600MB

(I'm supposed to have gigabytes free?!)

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

> Hm...
> What's /proc/sys/vm/overcommit_memory ?
> If it's set to 0, then the kernel will always overcommit.

I meant 1 (damn, I need sleep).

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

> Both OOM here (3.11.0-20-generic, 64-bit, Ubuntu).

Hm...
What's /proc/sys/vm/overcommit_memory ?
If it's set to 0, then the kernel will always overcommit.

If you set it to 2, normally you'd definitely get ENOMEM (which is IMO
much nicer than getting nuked by the OOM killer, especially because,
like in real life, there's often collateral damage ;-)

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Stefan Krah

Stefan Krah added the comment:

This is probably offtopic, but I think people who want reliable
MemoryErrors can use limits, e.g. via djb's softlimit (daemontools):

$ softlimit -m 1 ./python
Python 3.5.0a0 (default:462470859e57+, Apr 27 2014, 19:34:06)
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> [i for i in range(999)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
MemoryError

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> $ gcc -o /tmp/test /tmp/test.c; /tmp/test
> malloc() returned NULL after 3050MB
> $ gcc -DDO_MEMSET -o /tmp/test /tmp/test.c; /tmp/test
> malloc() returned NULL after 2130MB
> 
> Without memset, the kernel happily allocates until we reach the 3GB
> user address space limit.
> With memset, it bails out way before.
> 
> I don't know what this'll give on 64-bit, but I assume one should get
> comparable result.

Both OOM here (3.11.0-20-generic, 64-bit, Ubuntu).

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

> So yeah, touching pages can affect whether a later malloc returns ENOMEM.
>
> I'm not sure any of this actually matters in the Python case though :-). 
> There's still no reason to go touching pages pre-emptively just in case we 
> might write to them later -- all that does is increase the interpreter's 
> memory footprint, which can't help anything. If people are worried about 
> overcommit, then they should turn off overcommit, not try and disable it on a 
> piece-by-piece basis by trying to get individual programs to touch memory 
> before they need it.

Absolutely: that's why I'm really in favor of exposing calloc, this
could definitely help many workloads.

Victor, did you run any non-trivial benchmark, like pybench & Co?

As I said, I'm not expecting any improvement, I just want to make sure
there's no hidden regression somewhere (like the one for GC-tracked
objects above).

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

Alright, it bothered me so I wrote a small C testcase (attached),
which calls malloc in a loop, and can call memset upon the allocated
block right after allocation:

$ gcc -o /tmp/test /tmp/test.c; /tmp/test
malloc() returned NULL after 3050MB
$ gcc -DDO_MEMSET -o /tmp/test /tmp/test.c; /tmp/test
malloc() returned NULL after 2130MB

Without memset, the kernel happily allocates until we reach the 3GB
user address space limit.
With memset, it bails out way before.

I don't know what this'll give on 64-bit, but I assume one should get
comparable result.

I would guess that the reason why the Python list allocation fails is
because of the exponential allocation scheme: since memory is
allocated in large chunks before being used, the kernel happily
overallocates.
With a more progressive allocation+usage, it should return ENOMEM at some point.
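
The over-allocation is easy to observe from Python itself. The following is a rough sketch, assuming CPython (sys.getsizeof() is implementation specific); the helper name resize_points is invented for the example.

```python
import sys


def resize_points(n):
    """Return the list lengths at which the list's reported size changed.

    On CPython the size only changes when the item array is reallocated,
    so the gaps between the returned lengths show the over-allocation.
    """
    items = []
    last_size = sys.getsizeof(items)
    points = []
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            points.append(len(items))
            last_size = size
    return points


if __name__ == "__main__":
    print(resize_points(1000))
```

The list is resized far fewer times than it is appended to, so each resize asks for memory that is not written to until later appends.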

Anyway, that's probably off-topic!

--
Added file: http://bugs.python.org/file35059/test.c

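#include <stdio.h>
#include <stdlib.h>
#include <string.h>


#define BLOCK_SIZE (10*1024*1024)


int main(int argc, char *argv[])
{
    unsigned long size = 0;
    char *p;

    /* Allocate 10 MB blocks until malloc() fails; blocks are never freed on purpose. */
    while ((p = malloc(BLOCK_SIZE)) != NULL) {
#ifdef DO_MEMSET
        /* Touch every page so the kernel has to back it with real memory. */
        memset(p, 0, BLOCK_SIZE);
#endif
        size += BLOCK_SIZE;
    }

    /* size is an unsigned long, so it needs %lu rather than %u. */
    printf("malloc() returned NULL after %luMB\n", (size/(1024*1024)));

    exit(EXIT_SUCCESS);
}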



[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Right, python3 -c 'b"x" * (2 ** 48)' does give an instant MemoryError for me. 
So I was wrong about it being the VM limit indeed.

The documentation on this is terrible! But, if I'm reading this right:
   http://lxr.free-electrons.com/source/mm/util.c#L434
the actual rules are:

overcommit mode 1: allocating a VM range always succeeds.
overcommit mode 2: (Slightly simplified) You can allocate total VM ranges up to 
(swap + RAM * overcommit_ratio), and overcommit_ratio is 50% by default. So 
that's a bit odd, but whatever. This is still entirely a limit on VM size.
overcommit mode 0 ("guess", the default): when allocating a VM range, the 
kernel imagines what would happen if you immediately used all those pages. If 
that would put you OOM, then we fall back to mode 2 rules. If that would *not* 
put you OOM, then the allocation unconditionally succeeds.

So yeah, touching pages can affect whether a later malloc returns ENOMEM.

I'm not sure any of this actually matters in the Python case though :-). 
There's still no reason to go touching pages pre-emptively just in case we 
might write to them later -- all that does is increase the interpreter's memory 
footprint, which can't help anything. If people are worried about overcommit, 
then they should turn off overcommit, not try and disable it on a 
piece-by-piece basis by trying to get individual programs to touch memory 
before they need it.
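
For reference, here is a small sketch that reports which of the three modes a machine is in. The one-line summaries only paraphrase the rules as described above, the names describe_overcommit and read_overcommit_mode are invented for the example, and the file path is Linux specific (the reader returns None anywhere else).

```python
OVERCOMMIT_MODES = {
    0: "guess (default): heuristic check when a VM range is allocated",
    1: "always: allocating a VM range always succeeds",
    2: "strict: total VM limited to swap + RAM * overcommit_ratio / 100",
}


def describe_overcommit(mode):
    """Return a short description of a vm.overcommit_memory value."""
    return OVERCOMMIT_MODES.get(mode, "unknown mode %r" % (mode,))


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the current mode as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = read_overcommit_mode()
    if mode is None:
        print("overcommit mode not available on this platform")
    else:
        print(mode, "->", describe_overcommit(mode))
```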

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

Dammit, read:

python -c 'b"x" * (2**48)'

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

> And your test.py produces the same result. Are you sure you don't have a 
> ulimit set on address space?

Yep, I'm sure:
$  ulimit -v
unlimited

It's probably due to the exponential over-allocation used by the array
(to guarantee amortized constant cost).

How about:
python -c "b = bytes('x' * )"

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Nathaniel Smith

Nathaniel Smith added the comment:

On my laptop (x86-64, Linux 3.13, 12 GB RAM):

$ python3 -c "[i for i in range(9)]"
zsh: killed python3 -c "[i for i in range(9)]"

$ dmesg | tail -n 2
[404714.401901] Out of memory: Kill process 10752 (python3) score 687 or 
sacrifice child
[404714.401903] Killed process 10752 (python3) total-vm:17061508kB, 
anon-rss:10559004kB, file-rss:52kB

And your test.py produces the same result. Are you sure you don't have a ulimit 
set on address space?

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> Just try python -c "[i for i in
> range()]" on a 64-bit machine, I'll bet you'll get a
> MemoryError (ENOMEM).

Hmm, I get an OOM kill here.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

> @Charles-François: I think your worries about calloc and overcommit are 
> unjustified. First, calloc and malloc+memset actually behave the same way 
> here -- with a large allocation and overcommit enabled, malloc and calloc 
> will both go ahead and return the large allocation, and then the actual 
> out-of-memory (OOM) event won't occur until the memory is accessed. In the 
> malloc+memset case this access will occur immediately after the malloc, 
> during the memset -- but this is still too late for us to detect the malloc 
> failure.

Not really: what you describe only holds for a single object.
But if you allocate let's say 1000 such objects at once:
- in the malloc + memset case, the committed pages are progressively
accessed (i.e. the pages for object N are accessed before the memory
is allocated for object N+1), so they will be counted not only as
committed, but also as active (for example the RSS will increase
gradually): so at some point, even though by default the Linux VM
subsystem is really lenient toward overcommitting, you'll likely have
malloc/mmap return NULL because of this
- in the calloc() case, all the memory is first committed, but not
touched: the kernel will likely happily overcommit all of this. Only
when you start progressively accessing the pages will the OOM kick in.
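
The difference between committed and touched memory can be watched from Python on Linux. This is only a sketch under a few assumptions: an anonymous mmap stands in for a large calloc()'ed block, the page size is taken to be 4096 bytes, and the helper names parse_status_kb and current_kb are invented for the example.

```python
import mmap


def parse_status_kb(status_text, field):
    """Return the kB value of a field such as 'VmRSS' from the text of
    /proc/<pid>/status, or None if the field is absent."""
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name == field:
            return int(rest.split()[0])
    return None


def current_kb(field):
    """Linux only: read the field for this process, else return None."""
    try:
        with open("/proc/self/status") as f:
            return parse_status_kb(f.read(), field)
    except OSError:
        return None


if __name__ == "__main__":
    page = 4096                          # assumed page size
    buf = mmap.mmap(-1, 64 * 1024 * 1024)  # zero-filled, not touched yet
    print("VmRSS after mapping: ", current_kb("VmRSS"))
    for offset in range(0, len(buf), page):
        buf[offset] = 1                  # touch one byte per page
    print("VmRSS after touching:", current_kb("VmRSS"))
    buf.close()
```

On Linux the second figure should be noticeably larger than the first, since the pages are only counted in the RSS once they have been written to.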

> Second, OOM does not cause segfaults on any system I know. On Linux it wakes 
> up the OOM killer, which shoots some random (possibly guilty) process in the 
> head. The actual program which triggered the OOM is quite likely to escape 
> unscathed.

Ah, did I say segfault?
Sorry, I of course meant that the process will get nuked by the OOM killer.

> In practice, the *only* cases where you can get a MemoryError on modern 
> systems are (a) if the user has turned overcommit off, (b) you're on a tiny 
> embedded system that doesn't have overcommit, (c) if you run out of virtual 
> address space. None of these cases are affected by the differences between 
> malloc and calloc.

That's a common misconception: provided that the memory allocated is
accessed progressively (see above point), you'll often get ENOMEM,
even with overcommitting:

$ /sbin/sysctl -a | grep overcommit
vm.nr_overcommit_hugepages = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50

$ cat /tmp/test.py
l = []

with open('/proc/self/status') as f:
    try:
        for i in range(5000):
            l.append(i)
    except MemoryError:
        for line in f:
            if 'VmPeak' in line:
                print(line)
        raise

$ python /tmp/test.py
VmPeak:   720460 kB

Traceback (most recent call last):
  File "/tmp/test.py", line 7, in <module>
    l.append(i)
MemoryError

I have a 32-bit machine, but the process definitely has more than
720MB of address space ;-)

If your statement were true, this would mean that it's almost
impossible to get ENOMEM with overcommitting on a 64-bit machine,
which is - fortunately - not true. Just try python -c "[i for i in
range()]" on a 64-bit machine, I'll bet you'll get a
MemoryError (ENOMEM).

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Nathaniel Smith

Nathaniel Smith added the comment:

@Charles-François: I think your worries about calloc and overcommit are 
unjustified. First, calloc and malloc+memset actually behave the same way here 
-- with a large allocation and overcommit enabled, malloc and calloc will both 
go ahead and return the large allocation, and then the actual out-of-memory 
(OOM) event won't occur until the memory is accessed. In the malloc+memset case 
this access will occur immediately after the malloc, during the memset -- but 
this is still too late for us to detect the malloc failure. Second, OOM does 
not cause segfaults on any system I know. On Linux it wakes up the OOM killer, 
which shoots some random (possibly guilty) process in the head. The actual 
program which triggered the OOM is quite likely to escape unscathed. In 
practice, the *only* cases where you can get a MemoryError on modern systems 
are (a) if the user has turned overcommit off, (b) you're on a tiny embedded 
system that doesn't have overcommit, (c) if you run out of virtual address 
space. None of these cases are affected by the differences between malloc and 
calloc.

Regarding the calloc API: it's a wart, but it seems like a pretty unavoidable 
wart at this point, and the API compatibility argument is strong. I think we 
should just keep the two argument form and live with it...

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

"Because if a code creates many such objects which basically just do
calloc(), on operating systems with memory overcommitting (such as
Linux), the calloc() allocations will pretty much always succeed, but
will segfault when the page is first written to in case of low memory."

Overcommit leads to a segmentation fault when there is no more memory, but I 
don't see how calloc() is worse than malloc()+memset(0). It will crash in both 
cases, no?

In my experience (embedded device with low memory), programs crash because they 
don't check the result of malloc() (return NULL on allocation failure), not 
because of overcommit.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

list: items are allocated in a second memory block. PyList_New() uses memset(0) 
to set all items to NULL.

tuple: header and items are stored in a single structure (PyTupleObject), in a 
single memory block. PyTuple_New() fills the items with NULL (so it writes null 
bytes again). Something can be optimized here.

dict: header, keys and values are stored in 3 different memory blocks. It may 
be interesting to use calloc() to allocate keys and values. Initialization of 
keys and values to NULL uses a dummy loop. I expect that memset(0) would be 
faster.

Anyway, I expect that all items of builtin containers (tuple, list, dict, etc.) 
are set to non-NULL values. So the lazy initialization to zeros may be useless 
for them.

It means that benchmarking builtin containers should not show any speedup. 
Something else (numpy?) should be used to see an interesting speedup.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

> __libc_calloc() starts with a check on integer overflow.

Yes, see my previous message:
"""
AFAICT, the two arguments are purely historical (it was used when
malloc() didn't guarantee suitable alignment, and has the advantage of
performing overflow check when doing the multiplication, but in our
code we always check for it anyway).
"""

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

"And 
http://www.eglibc.org/cgi-bin/viewvc.cgi/trunk/libc/malloc/malloc.c?view=markup
to check that calloc(nelem, elsize) is implemented as calloc(nelem *
elsize)"

__libc_calloc() starts with a check on integer overflow.
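
To illustrate what such a check amounts to, here is a sketch that models it in Python. It is not the glibc code; the name checked_total and the 64-bit SIZE_MAX are assumptions made for the example.

```python
SIZE_MAX = 2 ** 64 - 1  # assumes a 64-bit size_t


def checked_total(nelem, elsize, size_max=SIZE_MAX):
    """Return nelem * elsize, or None if the product does not fit in size_t.

    A C implementation cannot compute the product first, because it would
    wrap around, so the comparison is done with a division instead.
    """
    if nelem < 0 or elsize < 0:
        raise ValueError("sizes must be non-negative")
    if nelem != 0 and elsize > size_max // nelem:
        return None  # calloc() would return NULL here
    return nelem * elsize


if __name__ == "__main__":
    print(checked_total(1, 100))          # the common calloc(1, n) pattern
    print(checked_total(2 ** 32, 2 ** 32))  # too large for a 64-bit size_t
```

A single-argument calloc(size) would leave this check to every caller, which is the trade-off being discussed.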

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

> It looks like calloc-3.patch is wrong: it modifies _PyObject_GC_Malloc() to 
> fill the newly allocated buffer with zeros, but _PyObject_GC_Malloc() is not 
> only called by PyType_GenericAlloc(): it is also used by _PyObject_GC_New() 
> and _PyObject_GC_NewVar(). The patch is maybe a little bit slower because it 
> writes zeros twice.

Exactly (sorry, I thought you'd already seen that, otherwise I could
have told you!)

> Actually, I think we have to match the C-API:  For instance, in
> Modules/_decimal/_decimal.c:5527 the libmpdec allocators are
> set to the Python allocators.

Hmm, ok then, I didn't know we were plugging our allocators for
external libraries: that's indeed a very good reason to keep the same
prototype.

But I still find this API cumbersome: calloc is exactly like malloc
except for the zeroing, so the prototype could be simpler (a quick
look at Victor's patch shows a lot of calloc(1, n), which is a sign
something's wrong). Maybe it's just me ;-)

Otherwise, a random thought: by changing PyType_GenericAlloc() from
malloc() + memset(0) to calloc(), there could be a subtle side effect:
if a given type relies on the 0-setting (which is documented), and
doesn't do any other work on the allocated area behind the scenes
(think about a mmap-like object), we could lose our capacity to detect
MemoryError, and run into segfaults instead.

Because if a code creates many such objects which basically just do
calloc(), on operating systems with memory overcommitting (such as
Linux), the calloc() allocations will pretty much always succeed, but
will segfault when the page is first written to in case of low memory.

I don't think such use cases should be common: I would expect most
types to use tp_alloc(type, 0) and then use an internal additional
pointer for the allocations it needs, or immediately write to the
allocated memory area right after allocation, but that's something to
keep in mind.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Stefan Krah

Stefan Krah added the comment:

Actually, I think we have to match the C-API:  For instance, in
Modules/_decimal/_decimal.c:5527 the libmpdec allocators are
set to the Python allocators.

So I'd need to do:

mpd_callocfunc = PyMem_Calloc;


I suppose that's a common use case.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

It looks like calloc-3.patch is wrong: it modifies _PyObject_GC_Malloc() to fill 
the newly allocated buffer with zeros, but _PyObject_GC_Malloc() is not only 
called by PyType_GenericAlloc(): it is also used by _PyObject_GC_New() and 
_PyObject_GC_NewVar(). The patch is maybe a little bit slower because it writes 
zeros twice.

calloc.patch adds "PyObject* _PyObject_GC_Calloc(size_t);" and doesn't have 
this issue.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Stefan Krah

Stefan Krah added the comment:

Just to add another data point, I don't find the calloc() API
cumbersome.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> Regarding the *Calloc functions: how about we provide a sane API
> instead of reproducing the cumbersome C API?

Isn't the point of reproducing the C API to allow quickly switching from 
calloc() to PyObject_Calloc()?
(besides, it seems the OpenBSD guys like the two-argument form :-))

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

I wrote a short microbenchmark allocating objects using my benchmark.py script.

It looks like the operation "(None,) * N" is slower with calloc-3.patch, but 
it's unclear how much slower it is. I don't understand why only this 
operation has a different speed.

Do you have ideas for other benchmarks?

Using the timeit module:

$ ./python.orig -m timeit '(None,) * 10**5'
1000 loops, best of 3: 357 usec per loop
$ ./python.calloc -m timeit '(None,) * 10**5'
1000 loops, best of 3: 698 usec per loop

But with different parameters, the difference is lower:

$ ./python.orig -m timeit -r 20 -n '1000' '(None,) * 10**5'
1000 loops, best of 20: 362 usec per loop
$ ./python.calloc -m timeit -r 20 -n '1000' '(None,) * 10**5'
1000 loops, best of 20: 392 usec per loop


Results of bench_alloc.py:

Common platform:
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g 
-fwrapv -O3 -Wall -Wstrict-prototypes
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Python unicode implementation: PEP 393
Timer info: namespace(adjustable=False, 
implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, 
resolution=1e-09)
Timer: time.perf_counter
SCM: hg revision=462470859e57+ branch=default date="2014-04-26 19:01 -0400"
Platform: Linux-3.13.8-200.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
Bits: int=32, long=64, long long=64, size_t=64, void*=64

Platform of campaign orig:
Timer precision: 42 ns
Date: 2014-04-27 12:27:26
Python version: 3.5.0a0 (default:462470859e57, Apr 27 2014, 11:52:55) [GCC 
4.8.2 20131212 (Red Hat 4.8.2-7)]

Platform of campaign calloc:
Timer precision: 45 ns
Date: 2014-04-27 12:29:10
Python version: 3.5.0a0 (default:462470859e57+, Apr 27 2014, 12:04:57) [GCC 
4.8.2 20131212 (Red Hat 4.8.2-7)]

-----------------------------------+--------------+---------------
Tests                              |         orig |         calloc
-----------------------------------+--------------+---------------
object()                           |    61 ns (*) |          62 ns
b'A' * 10                          |    55 ns (*) |    51 ns (-7%)
b'A' * 10**3                       |    99 ns (*) |          94 ns
b'A' * 10**6                       |  37.5 us (*) |        36.6 us
'A' * 10                           |    62 ns (*) |    58 ns (-7%)
'A' * 10**3                        |   107 ns (*) |         104 ns
'A' * 10**6                        |    37 us (*) |        36.6 us
'A' * 10**8                        |  16.2 ms (*) |        16.4 ms
decode 10 null bytes from ASCII    |   253 ns (*) |         248 ns
decode 10**3 null bytes from ASCII |   359 ns (*) |         357 ns
decode 10**6 null bytes from ASCII |  78.8 us (*) |        78.7 us
decode 10**8 null bytes from ASCII |  26.2 ms (*) |        25.9 ms
(None,) * 10**0                    |    30 ns (*) |          30 ns
(None,) * 10**1                    |    78 ns (*) |          77 ns
(None,) * 10**2                    |   427 ns (*) |   460 ns (+8%)
(None,) * 10**3                    |   3.5 us (*) |   3.7 us (+6%)
(None,) * 10**4                    |  34.7 us (*) |  37.2 us (+7%)
(None,) * 10**5                    |   357 us (*) |   390 us (+9%)
(None,) * 10**6                    |  3.86 ms (*) | 4.43 ms (+15%)
(None,) * 10**7                    |  50.4 ms (*) |        50.3 ms
(None,) * 10**8                    |   505 ms (*) |         504 ms
([None] * 10)[1:-1]                |   121 ns (*) |         120 ns
([None] * 10**3)[1:-1]             |  3.57 us (*) |        3.57 us
([None] * 10**6)[1:-1]             |  4.61 ms (*) |        4.59 ms
([None] * 10**8)[1:-1]             |   585 ms (*) |         582 ms
-----------------------------------+--------------+---------------
Total                              | 1.19 sec (*) |       1.19 sec
-----------------------------------+--------------+---------------

--
Added file: http://bugs.python.org/file35052/bench_alloc.py




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

Note to numpy devs: it would be great if some of you followed the
python-dev mailing list (I know it can be quite high-volume, but
maybe simple filters could help keep the noise down): you definitely
have both the expertise and the real-life applications that could be
very valuable in helping us design the best possible public/private
APIs. It's always great to have downstream experts/end-users!

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

> This issue was opened to be able to use tracemalloc on numpy. I would
> like to make sure that calloc is enough for numpy. I would prefer to
> change the malloc API only once.

Then please at least rename the issue. Also, I don't see why
everything should be done at once: calloc support is a self-contained
change, which is useful outside of numpy. Enhanced tracemalloc support
for numpy certainly belongs to another issue.

Regarding the *Calloc functions: how about we provide a sane API
instead of reproducing the cumbersome C API?

I mean, why not expose:
PyAPI_FUNC(void *) PyMem_Calloc(size_t size);
instead of
PyAPI_FUNC(void *) PyMem_Calloc(size_t nelem, size_t elsize);

AFAICT, the two arguments are purely historical (they were used when
malloc() didn't guarantee suitable alignment, and they have the advantage
of performing an overflow check when doing the multiplication, but in our
code we always check for it anyway).
See
https://groups.google.com/forum/#!topic/comp.lang.c/jZbiyuYqjB4
http://stackoverflow.com/questions/4083916/two-arguments-to-calloc

And 
http://www.eglibc.org/cgi-bin/viewvc.cgi/trunk/libc/malloc/malloc.c?view=markup
to check that calloc(nelem, elsize) is implemented as calloc(nelem *
elsize)
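
To make the trade-off above concrete, here is a minimal sketch. The helper
names (checked_mul, zalloc_bytes, zalloc_array) are hypothetical and are not
part of CPython or of any patch attached to this issue; the sketch only shows
where the overflow check ends up with each prototype.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: store nelem * elsize in *total, or return -1 if the
   product does not fit in size_t. */
static int
checked_mul(size_t nelem, size_t elsize, size_t *total)
{
    assert(total != NULL);
    if (nelem != 0 && elsize > SIZE_MAX / nelem) {
        return -1;
    }
    *total = nelem * elsize;
    return 0;
}

/* Single-argument form: the caller has already computed (and checked) the
   byte count. */
static void *
zalloc_bytes(size_t size)
{
    return calloc(1, size);
}

/* Two-argument form: the overflow check lives in one place. */
static void *
zalloc_array(size_t nelem, size_t elsize)
{
    size_t total;

    if (checked_mul(nelem, elsize, &total) < 0) {
        return NULL;
    }
    return zalloc_bytes(total);
}
```

With the single-argument form every caller has to do what checked_mul() does
before calling; with the two-argument form a caller can pass the element
count and element size straight through.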

I'm also concerned about the change to _PyObject_GC_Malloc(): it now
calls calloc() instead of malloc(): it can definitely be slower.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread STINNER Victor

STINNER Victor added the comment:

2014-04-27 10:30 GMT+02:00 Charles-François Natali :
>> I read again some remarks about alignment, it was suggested to provide 
>> allocators providing an address aligned to a requested alignment. This 
>> topic was already discussed in #18835.
>
> The alignment issue is really orthogonal to the calloc one, so IMO
> this shouldn't be discussed here (and FWIW I don't think we should
> expose those: alignment only matters either for concurrency or SIMD
> instructions, and I don't think we should try to standardize this kind
> of API, it's way too special-purpose (then we'd have to think about
> huge pages, etc...). Whereas calloc is a simple and immediately useful
> addition, not only for Numpy but also CPython).

This issue was opened to be able to use tracemalloc on numpy. I would
like to make sure that calloc is enough for numpy. I would prefer to
change the malloc API only once.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-27 Thread Charles-François Natali

Charles-François Natali added the comment:

> I read again some remarks about alignment, it was suggested to provide 
> allocators providing an address aligned to a requested alignment. This topic 
> was already discussed in #18835.

The alignment issue is really orthogonal to the calloc one, so IMO
this shouldn't be discussed here (and FWIW I don't think we should
expose those: alignment only matters either for concurrency or SIMD
instructions, and I don't think we should try to standardize this kind
of API, it's way too special-purpose (then we'd have to think about
huge pages, etc...). Whereas calloc is a simple and immediately useful
addition, not only for Numpy but also CPython).

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-26 Thread STINNER Victor

STINNER Victor added the comment:

I reread some remarks about alignment: it was suggested to provide 
allocators that return an address aligned to a requested alignment. This topic 
was already discussed in #18835.

If Python doesn't provide such memory allocators, it was suggested to provide a 
"trace" function which can be called on the result of a successful allocator 
call to "trace" an allocation (and a similar function for free). But this is 
very different from the design of PEP 445 (new malloc API). Basically, it 
would require rewriting PEP 445.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-17 Thread Julian Taylor

Julian Taylor added the comment:

I just tested it: PyObject_NewVar seems to use RawMalloc, not the GC malloc, so 
it's probably fine.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-17 Thread Josh Rosenberg

Josh Rosenberg added the comment:

Well, to be more specific, PyType_GenericAlloc was originally calling one of 
two methods that didn't zero the memory (one of which was GC_Malloc), then 
memset-ing. Just realized you're talking about something else; not sure if 
you're correct about this now, but I have to get to work, will check later if 
no one else does.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-17 Thread Josh Rosenberg

Josh Rosenberg added the comment:

Julian: No. See the diff: 
http://bugs.python.org/review/21233/diff/11644/Objects/typeobject.c

The original GC_Malloc was explicitly memset-ing after confirming that it 
received a non-NULL pointer from the underlying malloc call; that memset is 
removed in favor of using the calloc call.
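
As an illustration of the two patterns being compared, here is a minimal
sketch. The function names are hypothetical and this is not the code from
the linked diff; it only shows "malloc, check for NULL, memset" next to a
single calloc call, plus a small helper to confirm both results are zeroed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Old pattern: allocate, check for NULL, then clear explicitly. */
static void *
alloc_zeroed_memset(size_t size)
{
    void *ptr = malloc(size);

    if (ptr == NULL) {
        return NULL;
    }
    memset(ptr, 0, size);
    return ptr;
}

/* New pattern: let the allocator hand back zero-filled memory. */
static void *
alloc_zeroed_calloc(size_t size)
{
    return calloc(1, size);
}

/* Return 1 if the first size bytes at ptr are all zero, else 0. */
static int
is_all_zero(const void *ptr, size_t size)
{
    const unsigned char *bytes = (const unsigned char *)ptr;
    size_t i;

    assert(ptr != NULL);
    for (i = 0; i < size; i++) {
        if (bytes[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Both functions return memory with the same contents; any difference is in
how the zeroing is done, which is what the benchmarks requested on this issue
are meant to measure.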

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-17 Thread Julian Taylor

Julian Taylor added the comment:

Won't replacing _PyObject_GC_Malloc with a calloc cause Var objects 
(PyObject_NewVar) to be completely zeroed, which I think they weren't before?
Some numeric programs stuff a lot of data into var objects and could care about 
Python suddenly setting them to zero when they don't need it.
An example would be tinyarray.

--
nosy: +jtaylor




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-17 Thread Charles-François Natali

Charles-François Natali added the comment:

Do you have benchmarks?

(I'm not looking for an improvement, just no regression.)

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-16 Thread STINNER Victor

STINNER Victor added the comment:

Patch version 3: remove _PyObject_GC_Calloc() and modify _PyObject_GC_Malloc() 
instead, so that it uses calloc() instead of malloc()+memset(0).

--
Added file: http://bugs.python.org/file34924/calloc-3.patch




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-16 Thread Antoine Pitrou

Antoine Pitrou added the comment:

On Wed., 2014-04-16 at 08:06 +, STINNER Victor wrote:
> I didn't check which objects use (indirectly) _PyObject_GC_Calloc().

I've checked: lists, tuples, dicts and sets at least seem to use it.
Obviously, objects which are not tracked by the GC (such as str and
bytes) won't use it.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-16 Thread Stefan Krah

Stefan Krah added the comment:

I left a Rietveld comment, which probably did not get mailed.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-16 Thread STINNER Victor

STINNER Victor added the comment:

2014-04-16 3:18 GMT-04:00 Charles-François Natali :
>> It calls calloc(size) instead of malloc(size); calloc() can be faster 
>> than malloc()+memset(), see:
>> https://mail.python.org/pipermail/python-dev/2014-April/133985.html
>
> It will only make a difference if the allocated region is large enough
> to be allocated by mmap (so not for 90% of objects).

Even if it is faster in only 10% of cases, I think that it's worth
using calloc() to allocate Python objects. You may create large Python
objects ;-)

I didn't check which objects use (indirectly) _PyObject_GC_Calloc().

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-16 Thread STINNER Victor

STINNER Victor added the comment:

>>> So what is the point of _PyObject_GC_Calloc ?
>>
>> It calls calloc(size) instead of malloc(size)
>
> No, the question is why you didn't simply change _PyObject_GC_Malloc
> (which is a private function).

Oh ok, I didn't understand. I don't like changing the behaviour of
functions, but it's maybe fine if the function is private.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-16 Thread Charles-François Natali

Charles-François Natali added the comment:

>> So what is the point of _PyObject_GC_Calloc ?
>
> It calls calloc(size) instead of malloc(size); calloc() can be faster 
> than malloc()+memset(), see:
> https://mail.python.org/pipermail/python-dev/2014-April/133985.html

It will only make a difference if the allocated region is large enough
to be allocated by mmap (so not for 90% of objects).

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread Antoine Pitrou

Antoine Pitrou added the comment:

On 16/04/2014 04:40, STINNER Victor wrote:
>
> STINNER Victor added the comment:
>
>> So what is the point of _PyObject_GC_Calloc ?
>
> It calls calloc(size) instead of malloc(size)

No, the question is why you didn't simply change _PyObject_GC_Malloc 
(which is a private function).

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread STINNER Victor

STINNER Victor added the comment:

New patch:

- replace "size_t size" with "size_t nelem, size_t elsize" in the prototype of 
calloc functions (the parameter names come from the POSIX standard)
- replace "int calloc" with "int zero" in helper functions

--
Added file: http://bugs.python.org/file34903/calloc-2.patch




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread STINNER Victor

STINNER Victor added the comment:

In numpy, I found the two following functions:


/*NUMPY_API
 * Allocates memory for array data.
 */
void* PyDataMem_NEW(size_t size);

/*NUMPY_API
 * Allocates zeroed memory for array data.
 */
void* PyDataMem_NEW_ZEROED(size_t size, size_t elsize);

So it looks like it needs two size_t parameters. Prototype of the C function 
calloc():

void *calloc(size_t nmemb, size_t size);

I agree that it's better to provide the same prototype as calloc().

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread STINNER Victor

STINNER Victor added the comment:

> So what is the point of _PyObject_GC_Calloc ?

It calls calloc(size) instead of malloc(size); calloc() can be faster 
than malloc()+memset(), see:
https://mail.python.org/pipermail/python-dev/2014-April/133985.html

_PyObject_GC_Calloc() is used by PyType_GenericAlloc(). If I understand 
correctly, it is the default allocator for Python objects.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread Josh Rosenberg

Josh Rosenberg added the comment:

Sorry for breaking it up, but the same comment on consistent prototypes 
mirroring the C standard lib calloc would apply to all the API functions as 
well, e.g. PyMem_RawCalloc, PyMem_Calloc, PyObject_Calloc and 
_PyObject_GC_Calloc, not just the structure function pointer.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread Josh Rosenberg

Josh Rosenberg added the comment:

Additional comment on clarity: Might it make sense to make the calloc structure 
member take both the num and size arguments that the underlying calloc takes? 
That is, instead of:

void* (*calloc) (void *ctx, size_t size);

Declare it as:

void* (*calloc) (void *ctx, size_t num, size_t size);

Beyond potentially allowing more detailed tracing info at some later point (and 
much like the original calloc, potentially allowing us to verify that the 
components do not overflow on multiply, instead of assuming every caller must 
multiply and check for themselves), it also seems like it's a bit more friendly 
to have the prototype for the structure calloc to follow the same pattern as 
the other members for consistency (Principle of Least Surprise): A context 
pointer, plus the arguments expected by the equivalent C function.
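
A minimal sketch of what such a hook could look like follows. The struct and
function names (demo_allocator, demo_stats, traced_calloc and so on) are
hypothetical stand-ins, not the CPython definitions; the point is only that a
hook receiving both factors can record them and check the multiplication
itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for an allocator hook table; NOT the CPython struct.
   Each member takes a context pointer plus the arguments of the matching
   C function. */
typedef struct {
    void *ctx;
    void *(*malloc)(void *ctx, size_t size);
    void *(*calloc)(void *ctx, size_t num, size_t size);
    void (*free)(void *ctx, void *ptr);
} demo_allocator;

/* Hypothetical per-allocator statistics, passed around as ctx. */
typedef struct {
    size_t calloc_calls;
    size_t calloc_bytes;
} demo_stats;

static void *
traced_malloc(void *ctx, size_t size)
{
    (void)ctx;  /* unused in this sketch */
    return malloc(size);
}

static void *
traced_calloc(void *ctx, size_t num, size_t size)
{
    demo_stats *stats = (demo_stats *)ctx;
    void *ptr;

    assert(stats != NULL);
    /* Both factors are visible here, so the overflow check is done once. */
    if (num != 0 && size > SIZE_MAX / num) {
        return NULL;
    }
    ptr = calloc(num, size);
    if (ptr != NULL) {
        stats->calloc_calls++;
        stats->calloc_bytes += num * size;
    }
    return ptr;
}

static void
traced_free(void *ctx, void *ptr)
{
    (void)ctx;  /* unused in this sketch */
    free(ptr);
}
```

A caller would fill a demo_allocator with &stats and the three functions, then
allocate through alloc.calloc(alloc.ctx, num, size) and release through
alloc.free(alloc.ctx, ptr).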

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread Josh Rosenberg

Josh Rosenberg added the comment:

General comment on patch: For the flag value that toggles zero-ing, perhaps use 
a different name, e.g. setzero, clearmem, initzero or somesuch instead of 
calloc? calloc already gets used to refer to both the C standard function and 
the function pointer structure member; it's mildly confusing to have it *also* 
refer to a boolean flag as well.

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread Antoine Pitrou

Antoine Pitrou added the comment:

So what is the point of _PyObject_GC_Calloc ?

--




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread Josh Rosenberg

Changes by Josh Rosenberg :


--
nosy: +josh.rosenberg




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread STINNER Victor

Changes by STINNER Victor :


--
nosy: +neologix, pitrou




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread STINNER Victor

STINNER Victor added the comment:

Here is a first patch adding the following functions:

  void* PyMem_RawCalloc(size_t n);
  void* PyMem_Calloc(size_t n);
  void* PyObject_Calloc(size_t n);
  PyObject* _PyObject_GC_Calloc(size_t);

It adds the following field after the malloc field of the PyMemAllocator structure:

  void* (*calloc) (void *ctx, size_t size);

It changes the tracemalloc module to trace "calloc" allocations, adds new tests, 
and documents the new functions.

The patch also contains an important change: PyType_GenericAlloc() uses calloc 
instead of malloc+memset(0). It may be faster, I didn't check.

--
keywords: +patch
Added file: http://bugs.python.org/file34897/calloc.patch




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread Éric Araujo

Changes by Éric Araujo :


--
nosy: +haypo




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread Stefan Krah

Changes by Stefan Krah :


--
nosy: +skrah




[issue21233] Add *Calloc functions to CPython memory allocation API

2014-04-15 Thread Nathaniel Smith

New submission from Nathaniel Smith:

Numpy would like to switch to using the CPython allocator interface in order to 
take advantage of the new tracemalloc infrastructure in 3.4. But, numpy relies 
on the availability of calloc(), and the CPython allocator API does not expose 
calloc().
  https://docs.python.org/3.5/c-api/memory.html#c.PyMemAllocator

So, we should add *Calloc variants. This met general approval on python-dev. 
Thread here:
  https://mail.python.org/pipermail/python-dev/2014-April/133985.html

This would involve adding a new .calloc field to the PyMemAllocator struct, 
exposed through new API functions PyMem_RawCalloc, PyMem_Calloc, 
PyObject_Calloc. [It's not clear that all 3 would really be used, but since we 
have only one PyMemAllocator struct that they all share, it'd be hard to add 
support to only one or two of these domains and not the rest. And the 
higher-level calloc variants might well be used. Numpy array buffers are often 
small (e.g., holding only a single value), and these small buffers benefit from 
small-alloc optimizations; meanwhile, large buffers benefit from calloc 
optimizations. So it might be optimal to use a single allocator that has both.]

We might also have to rename the PyMemAllocator struct to ensure that compiling 
old code with the new headers doesn't silently leave garbage in the .calloc 
field and lead to crashes.

--
components: Interpreter Core
messages: 216281
nosy: njs
priority: normal
severity: normal
status: open
title: Add *Calloc functions to CPython memory allocation API
type: enhancement
versions: Python 3.5



