STINNER Victor added the comment:

Nathaniel Smith:
> So PyMem_Malloc would just call PyMem_RecordAlloc("heap", ptr, size) (or act 
> equivalently to something that called that, etc.), but something like PyCuda 
> might do PyMem_RecordAlloc("gpu", ptr, size) to track allocations in GPU 
> memory.

If I change tracemalloc, it won't be just to fulfill numpy's requirements; it 
must remain very generic. *If* we add something, I see 3 choices:

* add a C int to trace_t
* add a C char* to trace_t
* add a C void* to trace_t

int uses half the memory of char* or void* on 64-bit systems.
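
To check the sizes of the three candidate types on a given platform, a minimal 
sketch using ctypes (the dictionary name "sizes" is only illustrative; the 
values depend on the platform ABI):

```python
import ctypes

# Size in bytes of each candidate field type on the current platform.
candidates = [
    ("int", ctypes.c_int),
    ("char*", ctypes.c_char_p),
    ("void*", ctypes.c_void_p),
]
sizes = {name: ctypes.sizeof(ctype) for name, ctype in candidates}

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```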

The problem with void* is finding a way to expose it in Python. One option is 
to ignore it in the Python API, and only provide a C API to retrieve traces 
with the extra info.

void* would allow implementing the rejected option of also storing the C 
filename and C line number:
https://www.python.org/dev/peps/pep-0445/#pass-the-c-filename-and-line-number

When I designed PEP 445 (malloc API) and PEP 454 (tracemalloc), I recall 
that it was proposed to add "colors" (red, blue, etc.) to memory allocations. 
It sounds similar to your "heap" and "gpu" use case. It's just that you use an 
integer rather than a string.

Anyway, extending tracemalloc is non-trivial; we have to investigate use cases 
to design the new API. I would prefer to move step by step, and begin by 
exposing the existing API. What do you think?


> All the tracing stuff in tracemalloc would be awesome to have for GPU 
> allocations, and it would hardly require any changes to enable it. Ditto for 
> other leakable resources like file descriptors or shmem segments.

FYI I opened an issue to use tracemalloc when logging ResourceWarning:
http://bugs.python.org/issue26567


> I think the extra footprint would be tiny?

In my experience, the tracemalloc footprint is large. Basically, it doubles 
the total memory footprint. So adding 4 or 8 bytes to a trace which currently 
takes 16 bytes is not negligible!

Maybe we can be smart and use a compact trace when extra info is not stored 
(current trace_t) and switch to an "extended" trace (trace_t + int/char*/void*) 
when the extended API is used? That requires converting all existing traces 
from the compact to the extended format. It doesn't look too complex to support 
two formats, especially if the extended format is based on the compact format 
(trace_t structure used in extended_trace_t structure).
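
A rough sketch of that layout, modeled with ctypes rather than real C. The 
class names, the field names, the "domain" field and the extend() helper are 
all hypothetical; the two-field compact layout (a size and a traceback pointer) 
is an assumption consistent with the 16 bytes mentioned above. Actual sizes 
depend on the platform and on alignment padding.

```python
import ctypes


class CompactTrace(ctypes.Structure):
    # Assumed compact layout: allocation size plus a pointer to a traceback.
    _fields_ = [
        ("size", ctypes.c_size_t),
        ("traceback", ctypes.c_void_p),
    ]


class ExtendedTrace(ctypes.Structure):
    # The extended format embeds the compact one and appends the extra info.
    _fields_ = [
        ("trace", CompactTrace),
        ("domain", ctypes.c_int),  # hypothetical extra field
    ]


def extend(compact, domain=0):
    """Convert one compact trace to the extended format."""
    extended = ExtendedTrace()
    extended.trace.size = compact.size
    extended.trace.traceback = compact.traceback
    extended.domain = domain
    return extended


compact = CompactTrace(size=4096, traceback=None)
extended = extend(compact, domain=1)

# Compare the per-trace size of both formats on this platform.
print(ctypes.sizeof(CompactTrace), ctypes.sizeof(ExtendedTrace))
```

Because the compact structure is the first member of the extended one, code 
that only reads the size and the traceback could work on both formats.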


> Logically, you'd index traces by (domain, pointer) instead of (pointer)

That's not how tracemalloc is designed. _tracemalloc has a simple design. It's 
a simple hashtable: pointer => trace. The Python module tracemalloc.py is 
responsible for grouping traces:
https://docs.python.org/dev/library/tracemalloc.html#tracemalloc.Snapshot.statistics
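
For example, the same snapshot can be grouped in different ways after capture. 
A minimal sketch (the allocation in "blocks" is only there to produce some 
traces):

```python
import tracemalloc

tracemalloc.start()
blocks = [bytes(1000) for _ in range(100)]  # sample allocations to trace
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# The grouping is chosen at analysis time, not when traces are recorded.
by_line = snapshot.statistics("lineno")
by_file = snapshot.statistics("filename")

for stat in by_line[:3]:
    print(stat)
```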

The design is to have a simple and efficient _tracemalloc module, and off-load 
statistics to later. It allows capturing traces on a small and slow device, and 
then analyzing the data on a fast computer with more memory (ex: transfer data 
over the network). The idea is also to limit the overhead of using _tracemalloc.

Moreover, your proposed structure looks specific to your use case. I'm not sure 
that you always want to group by the domain. If domain is a C traceback 
(filename, line number), you probably don't want to group by traceback, but 
group by C filename for example.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26530>
_______________________________________