On 11/5/20 12:31 AM, Richard Biener wrote:
On Thu, Nov 5, 2020 at 1:59 AM Martin Sebor via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:

To determine the target of a pointer expression and the offset into
it, the increasingly widely used compute_objsize function traverses
the IL following the DEF statements of pointer variables, aggregating
offsets from POINTER_PLUS assignments along the way.  It does that
for many statements that involve pointers, including calls to
built-in functions and (so far only) accesses to char arrays.  When
a function has many such statements with pointers to the same objects
but with different offsets, the traversal ends up visiting the same
pointer assignments repeatedly and unnecessarily.

To avoid this repeated traversal, the attached patch adds the ability
to cache results obtained in prior calls to the function.  The cache
is optional and only used when enabled.

To exercise the cache I have enabled it for the strlen pass (which
is probably the heaviest compute_objsize user).  That happens to
resolve PR 97373 which tracks the pass' failure to detect sprintf
overflowing allocated buffers at a non-constant offset.  I thought
about making this a separate patch but the sprintf/strlen changes
are completely mechanical so it didn't seem worth the effort.

In the benchmarking I've done the cache isn't a huge win there but
it does have a measurable difference in the project I'm wrapping up
where most pointer assignments need to be examined.  The space used
for the cache is negligible on average: fewer than 20 entries per
Glibc function and about 6 for GCC.  The worst case in Glibc is
6 thousand entries and 10k in GCC.  Since the entries are sizable
(216 bytes each) the worst case memory consumption can be reduced
by adding a level of indirection.  A further savings can be obtained
by replacing some of the offset_int members of the entries with
HOST_WIDE_INT.

The efficiency benefits of the cache should increase further as more
of the access checking code is integrated into the same pass.  This
should eventually include the checking currently done in the built-in
expanders.

Tested on x86_64-linux, along with Glibc and Binutils/GDB.

I'm quite sure the objsz pass already has a cache, why not
re-use it instead of piggy-backing another one onto its machinery?

compute_objsize() and the objsz pass are completely independent.
The pass is also quite limited in that it doesn't make use of
ranges.  That limitation was also the main reason for introducing
the compute_objsize() function.

I'd love to see the objsize pass and compute_objsize() integrated
and exposed under an interface similar to range_query, with
the information available anywhere, and on demand.  I might tackle
it some day, but I expect it will be a nontrivial project, much
bigger than letting compute_objsize() do this simple caching for
the time being.

Martin


Richard.

Martin

PS The patch add the new pointer_query class (loosely modeled on
range_query) to builtins.{h,c}.  This should be only temporary,
until the access checking code is moved into a file (and ultimately
a pass) of its own.

Reply via email to