Re: analyzer: New state machine should be C++ only

2023-07-25 Thread Martin Uecker
Am Mittwoch, dem 12.07.2023 um 15:23 +0200 schrieb Benjamin Priour via
Gcc:
> Hi David,
> 
> 
> Lately I've been working on adding a new state machine to keep track
> of
> ownership transfers
> 
> and misuses, e.g. to warn about use-after-move, partial or shallow
> copy/move.
> 
> I'm trying to stay abstracted from heap allocated regions, and to
> rather
> work with "resources",
> 
> so that the state machine could be easily further extended.
> 
> However, the whole concern of ownership is really C++-like, and most
> of the
> checks would require
> 
> things unheard of in vanilla C, such as copy/move operators, ctors &
> dtors
> ...
> 
> 
> Using those constructs, it is really doable to guess ownership of
> resources, whereas without them it becomes
> 
> much more hazardous.
> 
> So, should we make this new sm -adroitly called sm-ownership- C++-
> only ?
> 
> 
> Doing so would allow the sm to reuse code from under cp/*, thus it'd
> reduce
> duplicating code and would
> 
> likely lead to less false positives in C++ -more precise function
> checks-,
> though it would make any future C-support more tedious.
> 
> It's also going against the current flow of porting what's already
> done for
> C to C++.

A gnu::owner attribute is certainly something which also make sense
for C.

Martin




Re: GNU Tools Cauldron 2023

2023-07-25 Thread Richard Earnshaw via Gcc
It is now just under 2 months until the GNU Tools Cauldron. 
Registration is still open, but we would really appreciate it if you 
could register as soon as possible so that we have a clear idea of the 
numbers.


Richard.

On 05/06/2023 14:59, Richard Earnshaw wrote:

We are pleased to invite you all to the next GNU Tools Cauldron,
taking place in Cambridge, UK, on September 22-24, 2023.

As for the previous instances, we have setup a wiki page for
details:

https://gcc.gnu.org/wiki/cauldron2023 




Like last year, we are having to charge for attendance.  We are still
working out what we will need to charge, but it will be no more than £250.

Attendance will remain free for community volunteers and others who do
not have a commercial backer and we will be providing a small number of
travel bursaries for students to attend.

For all details of how to register, and how to submit a proposal for a
track session, please see the wiki page.

The Cauldron is organized by a group of volunteers. We are keen to add
some more people so others can stand down. If you'd like to be part of
that organizing committee, please email the same address.

This announcement is being sent to the main mailing list of the
following groups: GCC, GDB, binutils, CGEN, DejaGnu, newlib and glibc.

Please feel free to share with other groups as appropriate.

Richard (on behalf of the GNU Tools Cauldron organizing committee).


Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-07-25 Thread David Malcolm via Gcc
On Tue, 2023-07-25 at 00:49 -0400, Eric Feng wrote:
> Hi all,

Hi Eric, thanks for the update.

Various comments inline below...

> 
> I would like to update everyone on the progress of the static
> analyzer
> plugin for CPython extension module code.
>  Since the last update, I
> have implemented known function subclasses for PyList_New and
> PyList_Append. The existing known function subclasses have also been
> enhanced to provide more information. For instance, we are now
> simulating object type specific fields in addition to just ob_refcnt
> and ob_type, which are shared by all PyObjects.

Do you have any DejaGnu tests for this functionality?  For example,
given PyList_New
  https://docs.python.org/3/c-api/list.html#c.PyList_New
there could be a test like:

/* { dg-require-effective-target python_h } */

#define PY_SSIZE_T_CLEAN
#include 
#include "analyzer-decls.h"

PyObject *
test_PyList_New (Py_ssize_t len)
{
  PyObject *obj = PyList_New (len);
  if (obj)
{
 __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
 __analyzer_eval (PyList_Check (obj)); /* { dg-warning "TRUE" } */
 __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning "TRUE" } */
}
  else
__analyzer_dump_path (); /* { dg-warning "path" } */
  return obj;
}

...or similar, to verify that we simulate that the call can both
succeed and fail, and to verify properties of the store along the
"success" path.  Caveat: I didn't look at exactly what properties
you're simulating, so the above tests might need adjusting.

The:
  /* { dg-require-effective-target python_h } */
allows the testcase to bail out with "UNSUPPORTED" on hosts that don't
have a suitable Python.h header file installed, so that all this
Python-specific functionality is optional.  You could implement this
via a new function "check_effective_target_python_h" in
gcc/testsuite/lib/target-supports.exp, similar to the existing
check_effective_target_ functions in that Tcl file.


> 
> Regarding reference count checking, I have implemented a naive
> traversal of the store to count the actual reference count of
> PyObjects, allowing us to compare it against the ob_refcnt fields of
> the same PyObjects. Although we can compare the actual reference count
> and the ob_refcnt field, I am still working on implementing a
> diagnostic to alert about this issue.

Sounds promising.

> 
> In addition to the progress update, I have some implementation
> related
> questions and would appreciate any input. The current moment at which
> we run the algorithm for reference count checking, and thereby also
> the moment at which we may want to issue
> impl_region_model_context::warn, is within region_model::pop_frame.
> However, it appears that m_stmt and m_stmt_finder are NULL at the
> time
> of region_model::pop_frame, which results in the diagnostic for the
> reference count getting rejected. I am having trouble finding a
> workaround for this issue, so any ideas would be welcome.

FWIW I've run into a similar issue, and so did Tim last year.  The
whole stmt vs stmt_finder thing is rather ugly, and I have a local
branch with a part-written reworking of it that I hope will solve these
issues (adding a "class pending_location"), but it's very messy and far
from being ready.  Sorry about this.

In theory you can provide an instance of your own custom stmt_finder
subclass when you save the pending_diagnostic.  There's an example of
doing this in engine.cc: impl_region_model_context::on_state_leak,
where the leak is going to be reported at the end of a function
(similar to your diagnostic); it uses a leak_stmt_finder class, which
simply scans backwards once it has an exploded_path to find the last
stmt that had a useful location_t value.  This is a nasty hack, but
probably does what you need.

> 
> I am also currently examining some issues related to state merging.

Note that you can use -fno-analyzer-state-merge to turn off the state
merging code, which can be very useful when debugging this kind of
thing.

> Let's consider the following example which lacks error checking:
> 
> PyObject* foo() {
>     PyObject item = PyLong_FromLong(10);
>     PyObject list = PyList_New(5);
>     return list;
> }
> 
> The states for when PyLong_FromLong fails and when PyLong_FromLong
> succeeds are merged before the call to PyObject* list =
> PyList_New(5).

Ideally we would emit a leak warning about the "success" case of
PyLong_FromLong here.  I think you're running into the problem of the
"store" part of the program_state being separate from the "malloc"
state machine part of program_state - I'm guessing that you're creating
a heap_allocated_region for the new python object, but the "malloc"
state machine isn't transitioning the pointer from "start" to "assumed-
non-null".  Such state machine states inhibit state-merging, and so
this might solve your state-merging problem.

I think we need a way to call malloc_state_machine::on_allocator_call
from outside of sm-malloc.cc.  Se