NoQ added a comment.

In https://reviews.llvm.org/D35216#856093, @rsmith wrote:

> The `CXXStdInitializerListExpr` node has pretty simple evaluation semantics: 
> it takes a glvalue array expression, and constructs a 
> `std::initializer_list<T>` from it as if by filling in the two members with a 
> pointer to the array and either the size of the array or a pointer to its 
> end. Can we not model those semantics directly here?


Yeah, it's definitely an easy case to perform complete modeling. However, in 
the analyzer we still lack some infrastructure for C++ library modeling, which 
was discussed quite a lot above, and i feel we've run out of debating time and 
still need to fix the false positive. It's important to make our checker APIs 
more powerful before we jump into modeling myriads of STL classes. The main 
problem is how to represent the private fields in the 
`std::initializer_list<T>` object in our symbolic memory model. Here's a quick 
summary of the above.

We already have the object's fields in the AST, with its AST record layout, 
however field names and layouts are implementation-dependent, and it is unsafe 
to try to understand how the specific standard library implementation's layout 
works and what specific private fields we see in the AST mean. Instead it seems 
safer to attach metadata to the object, which would contain values of these 
private fields - this is a common operation for the analyzer (eg. attach string 
length metadata to a null-terminated C-string).

However, because such metadata should be handled by an analyzer's checker (we 
cannot possibly model all APIs in the core, moving it into checkers is much 
more scalable and natural), and therefore such metadata would be private to the 
checker in one way or another, it makes checkers fully responsible for modeling 
such metadata. In particular, the very feature lack of which that was causing 
the false positive (expressing the fact that if the list is passed into an 
external function (eg. with no visible body available for analysis), objects 
stored in it are also "escaping" into a function and can be, for example, 
deallocated here, so other checkers need to be notified to stop tracking them) 
cannot currently be implemented on the checker side. If we let checkers express 
this, it'd require an additional checker callback boilerplate that would need 
to be repeated in every C++ checker, and we already have a lot of boilerplate. 
Additionally, when constructing the `initializer_list`'s elements, we already 
need to have the storage for them, which is hard to handle by the core when 
such storage is checker-specific.

Another solution would be to let the analyzer core help checkers with the 
simple boilerplate, which is a long-standing project that is to be took up. The 
only downside of this approach is that it might make checker APIs less 
omnipotent, because they can no longer maintain their metadata in arbitrary 
manners, but only in manners that are understandable by the core.

An alternative solution described here was to let the checkers produce 
"imaginary" fields within our symbolic memory model, with unknown offsets 
within the object (similar to objective-c instance variables) that would have 
benefits of no longer being implementation-defined. The difference with 
maintaining metadata is that most of the boilerplate would be handled by the 
memory model (RegionStore) rather than the checker, and the core would 
generally know that the pointers to the array are located within the object. 
However, this also raises multiple concerns, because such imaginary "ghost" 
fields may conflict with the real fields in weird manners, they may 
accidentally leak to user's diagnostics, and what not. And we still have the 
problem with checker-specific storage.

So these are the excuses :)


https://reviews.llvm.org/D35216



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to