NoQ added a comment.

Aha, ok, yeah, we should have seen this coming. Whenever a checker tracks pairs 
of objects, like containers and iterators, or strings objects and their 
internal buffers (https://reviews.llvm.org/D48522), or return values and 
out-parameters of the same function call (https://reviews.llvm.org/D32449), we 
should expect races on which of the two dies first. Such races are inevitable 
because any of the two may be arbitrarily stored for an indefinite amount of 
time. The test that you see failing is one of the tricky cases to understand 
because it's about liveness of a parameter region, which is a bit 
counter-intuitive.

I believe that //the checker should deal with symbol/region lifetime races, not 
try to prevent them//. The checker should only extend the lifetime of items it 
tracks in `checkLiveSymbols()` when the checker has additional information 
about how the programmer can access the tracked object despite not having a 
direct reference to it in the imperfect symbolic memory. Otherwise we're 
sacrificing our fairly precise liveness tracking, which serves two important 
purposes:

1. finding leaks (eg., if the container is allocated on the heap, we may get a 
false negative leak when an iterator suddenly starts keeping it alive) and

2. keeping states small for faster lookups (even though state cleanups is a top 
performance bottleneck that occupies 30-50% of the analyzer's performance, 
removing cleanups only makes things worse because state lookups become slower).

When solving liveness races, the key understanding is that when the object 
becomes dead, all information we gathered during analysis about the object is 
final. Such information will never become more precise, and it will never 
mutate. For instance, because the object cannot be accessed anymore //during 
the current analysis//, its begin() and end() locations will never change. The 
"during the current analysis" part is important: for example, in your test 
`simple_bad_end()`, the vector is marked as dead even though the previous stack 
frame definitely still has access to it, because the analysis is only conducted 
within the current stack frame and will never leave it. If we started our 
analysis from the caller, the vector would not have been marked as dead yet.

Therefore i believe that when the container goes out of scope, the iterator 
should be "disconnected" from the container, and any information we've had 
about the container should now be associated with the iterator itself. In your 
case it probably means connecting the iterator directly to the begin/end 
symbols. That puts a bit of stress into your program state trait structure 
because there are now two sorts of iterator info structures to handle (one that 
directs all queries to the container, another that has all the info within it 
directly). But that's the usual cost of handling a liveness race - cf. 
"Schrödinger mutex states" in https://reviews.llvm.org/D32449, which are only 
as different from your case as the nature of the objects you're tracking is 
different.

I'm slightly worried that it might potentially be possible to mutate the 
container significantly by only using an interator, without having access to 
the object itself. It doesn't seem possible - at most you'll be able to mutate 
the element to which the iterator points, but you can't eg. change the size of 
the container or invalidate other iterators. If you can in fact mutate the 
container significantly through an iterator, then we'll need to re-think this, 
and enforcing liveness may be the right thing to do (though we may also get 
away with simply keeping a dead region in the program state solely for 
convenience).


https://reviews.llvm.org/D48427



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to