There is a memory leak in the preprocessor (file tccpp.c). The path I traced
is in the macro argument substitution logic of the macro_arg_subst function,
but other paths seem to be affected as well, such as "expr_preprocess", where
token strings are also allocated and reallocated while the resulting pointer
is held only on the stack.
The leak occurs (on the traced path) when a syntax or token error is
encountered during the expansion of a macro argument and the function is left
non-locally via a failed expectation (the "expect" function).
The code path is:
File: tccpp.c
Function: macro_arg_subst
The specific call chain: macro_arg_subst -> tok_str_add2 -> tok_str_realloc ->
tal_realloc_impl
The error exit: An assertion or token expectation failure, such as expect(...),
which calls tcc_error().
The memory leak is a consequence of the preprocessor's memory model, which
uses the TinyAlloc (TAL) pool (tokstr_alloc) for efficiency, interacting
poorly with the non-local error handling mechanism (which can use longjmp or a
program exit depending on compiler state).
As far as I can see, TinyAlloc pool allocation is used for most small
strings/tokens. In that case tal_realloc_impl allocates the memory from the
internal, pre-allocated buffer of the tokstr_alloc pool. When the compiler is
cleaned up (e.g., via tcc_delete), this entire pool is freed in one call. No
leak occurs here.
When TOKSTR_TAL_LIMIT is exceeded, however, external heap allocation is used
(this is the leaking path):
The function tok_str_realloc calls tal_realloc_impl, which handles large
requests by falling back to the standard heap.
If the token string's required size exceeds the limit of the tokstr_alloc pool
(TinyAlloc->limit), tal_realloc_impl uses the underlying general allocator,
tcc_realloc (which defaults to the standard library realloc).
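To make the two allocation paths concrete, here is a minimal sketch of a pool
that falls back to the heap for large requests. It is not TCC's TinyAlloc
code: sketch_pool, sketch_alloc, sketch_free, POOL_SIZE and POOL_LIMIT are
hypothetical names, and alignment handling is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified pool. NOT the actual TinyAlloc implementation. */
#define POOL_SIZE  256   /* size of the pre-allocated buffer */
#define POOL_LIMIT 64    /* requests larger than this go to the heap */

typedef struct {
    unsigned char buffer[POOL_SIZE]; /* released together with the pool */
    size_t used;                     /* bytes already handed out */
    size_t heap_blocks;              /* live fallback (heap) allocations */
} sketch_pool;

/* Returns nonzero if ptr points into the pool's internal buffer. */
int sketch_in_pool(const sketch_pool *p, const void *ptr)
{
    uintptr_t a = (uintptr_t)ptr;
    uintptr_t lo = (uintptr_t)p->buffer;
    return a >= lo && a < lo + POOL_SIZE;
}

void *sketch_alloc(sketch_pool *p, size_t size)
{
    void *r;

    /* Small request that still fits: serve it from the buffer. */
    if (size <= POOL_LIMIT && size <= POOL_SIZE - p->used) {
        r = p->buffer + p->used;
        p->used += size;
        return r;
    }
    /* Large request or full buffer: fall back to the general heap.
       Only the caller's pointer refers to this block. */
    r = malloc(size);
    if (r != NULL)
        p->heap_blocks++;
    return r;
}

void sketch_free(sketch_pool *p, void *ptr)
{
    /* Pool blocks need no individual release. */
    if (ptr == NULL || sketch_in_pool(p, ptr))
        return;
    assert(p->heap_blocks > 0);
    free(ptr);
    p->heap_blocks--;
}
```

In this sketch a 16-byte request is served from the buffer, while a 1000-byte
request comes from malloc and has to be released through sketch_free by
whoever holds the pointer.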
This dynamically allocated block of memory is then pointed to by the local
(stack allocated) "TokenString str" variable.
If a token error is raised right after such an external heap allocation (e.g.,
expect("...") fails), "expect" calls tcc_error(), which performs a non-local
jump (e.g., via longjmp) or exits the process.
The normal return path of macro_arg_subst, including its cleanup code, is
completely bypassed.
The local pointer holding the address of the externally-allocated heap memory
is lost together with the stack frame, without a corresponding call to
tcc_free() (or tok_str_free).
Since tcc_delete only cleans up the TAL pools, the heap block is not tracked
for deallocation anywhere, and with the pointer gone the result is a permanent
memory leak.
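The mechanism can be reproduced outside of TCC with a few lines of C. The
following is only an illustration under simplified assumptions: demo_error
stands in for expect()/tcc_error(), demo_expand for a function such as
macro_arg_subst, and the live-block counter is a hypothetical helper to make
the leak observable. None of these names exist in TCC.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

static jmp_buf demo_error_jmp;    /* stands in for the error jump buffer */
static size_t demo_live_blocks;   /* heap blocks allocated and not yet freed */

void *demo_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        demo_live_blocks++;
    return p;
}

void demo_free(void *p)
{
    if (p == NULL)
        return;
    assert(demo_live_blocks > 0);
    demo_live_blocks--;
    free(p);
}

/* Stand-in for expect()/tcc_error(): never returns to its caller. */
void demo_error(void)
{
    longjmp(demo_error_jmp, 1);
}

/* Stand-in for the expanding function: the only reference to the heap
   block is the local variable buf. */
void demo_expand(int fail)
{
    char *buf = demo_malloc(1024);
    if (fail)
        demo_error();   /* frame is abandoned here, buf is never freed */
    demo_free(buf);
}

/* Runs one expansion and returns how many blocks it left behind. */
size_t demo_run(int fail)
{
    size_t before = demo_live_blocks;
    if (setjmp(demo_error_jmp) == 0)
        demo_expand(fail);
    return demo_live_blocks - before;
}
```

Calling demo_run(0) leaves no block behind, while demo_run(1) leaves exactly
one, because the jump skips the demo_free call at the end of demo_expand.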
For my local fork of TCC I will implement the simplest fix: explicitly freeing
the temporary token string before raising the error.
For the mob branch, we can consider adding a cleanup call to the token string
object just before calling the expect function, if the token object is known to
contain temporary expansion data that could be on the heap. That would be the
case for macro_arg_subst and expr_preprocess, but the parse_escape_string
function might need to be accounted for as well (depending on the ownership of
the "outstr" argument).
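As a rough sketch of the "free before raising the error" idea, here is a
standalone example. It is not a patch against tccpp.c: fix_tokstr is a
hypothetical stand-in for TokenString, fix_error for expect()/tcc_error(), and
the live-block counter exists only so the result can be checked.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for TokenString, always heap-backed here. */
typedef struct {
    int *data;
    size_t len, cap;
} fix_tokstr;

static jmp_buf fix_error_jmp;    /* stands in for the error jump buffer */
static size_t fix_live_blocks;   /* token buffers currently allocated */

void fix_tokstr_add(fix_tokstr *s, int tok)
{
    if (s->len == s->cap) {
        size_t cap = s->cap ? s->cap * 2 : 8;
        int *d = realloc(s->data, cap * sizeof *d);
        if (d == NULL)
            abort();
        if (s->data == NULL)
            fix_live_blocks++;   /* first allocation for this string */
        s->data = d;
        s->cap = cap;
    }
    s->data[s->len++] = tok;
}

void fix_tokstr_free(fix_tokstr *s)
{
    if (s->data != NULL) {
        assert(fix_live_blocks > 0);
        free(s->data);
        fix_live_blocks--;
    }
    s->data = NULL;
    s->len = s->cap = 0;
}

/* Stand-in for expect()/tcc_error(): never returns to its caller. */
void fix_error(void)
{
    longjmp(fix_error_jmp, 1);
}

/* Copies tokens into a temporary string; bad_tok triggers the error. */
void fix_expand(const int *toks, size_t n, int bad_tok)
{
    fix_tokstr str = {0};
    size_t i;

    for (i = 0; i < n; i++) {
        if (toks[i] == bad_tok) {
            fix_tokstr_free(&str);   /* release before the non-local exit */
            fix_error();
        }
        fix_tokstr_add(&str, toks[i]);
    }
    fix_tokstr_free(&str);           /* normal path */
}

/* Runs one expansion and returns how many buffers it left behind. */
size_t fix_run(const int *toks, size_t n, int bad_tok)
{
    size_t before = fix_live_blocks;
    if (setjmp(fix_error_jmp) == 0)
        fix_expand(toks, n, bad_tok);
    return fix_live_blocks - before;
}
```

With the cleanup call in place, fix_run leaves no buffer behind on either the
error path or the normal path. The drawback of this approach is that every
error site has to remember the cleanup, which is why it is only the simplest
fix and not necessarily the best one for the mob branch.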
Thank you for your time and consideration.
_______________________________________________
Tinycc-devel mailing list
https://lists.nongnu.org/mailman/listinfo/tinycc-devel