NagyDonat wrote:

> I'm gonna be honest with you that my approach was so far to find something 
> similar to what I want and copy paste it. Then carefully check the exploded 
> graph to see if it's correct.

After surveying code that uses `NodeBuilder`s my impression is that probably 
everyone followed this approach (or at least the first part, I wouldn't bet on 
everyone "carefully check"ing. There are several code fragments that are 
completely non-functional (e.g. add a node to a set, then immediately remove 
it), but as they do not cause any problems, they preserved and copied around.

This approach has the advantage that it actually gets things done (instead of 
digging deep in rabbit holes and unearthing the skeleton armies buried in our 
codebase), but I fear that it doesn't cover the "rare" cases.

In fact, even before I read this comment, I had the feeling that these part of 
the engine are "fragile" in the sense that they have lots of quirks that are 
usually irrelevant, but can differ from the "natural" behavior in some rare 
corner case. (E.g. adding a node to a set then removing it is usually a no-op, 
but if the node was somehow already present in the set, then it gets removed, 
which may be an error.)

My intuition is that anything that uses `NodeBuilder`s has a chance to behave 
incorrectly on exceptional input:
- encountering a sink node from a checker (on callbacks where that is rare),
- encountering a "node already exists" situation (cache out, node creation 
returns `nullptr`),
- encountering a posteriorly overconstrained state (probably the most 
exceptional case).

This is my primary motivation for cleaning up these parts of the codebase. Our 
data model is full of rare-but-can-happen-anywhere exceptional things (e.g. 
Undefined/Unknown values, sinks, caching out, posteriorly overconstrained 
states, reaching various budgets and complexity limits etc.) so I feel that it 
is infeasible to cover all corner cases experimentally with tests -- and 
instead I'm trying to ground the correctness of the code in theoretical 
understanding, invariants and readability. (Of course, we should _also_ have 
tests in addition to theory, but I feel that it is not enough to test the 
project as a black box full of skeletons.)

> Objectively, the engineers were really smart and had so many abstraction 
> layers over time, and only few of them made long term sense.

I feel that code written around 2010-12 (IIUC the early days of the analyzer) 
often contains overeager pretentious abstractions and generalizations that are 
challenging the readers ("are you as smart as me?") but provide no practical 
benefits. I don't know whether these are products of hubris ("I'm SO smart"), 
written with well-meaning naivety ("generic solutions will help development in 
the future") or just ruins of an ancient age -- maybe they _had_ some practical 
benefits in the past which was lost during some refactoring.

I also admit that there were some old abstractions that appeared to be useless 
at first, but since then I realized that they have some relevant (and sometimes 
even important) role.

In fact, since I wrote my complaints about the overcomplicated management of 
the `Frontier` (in `NodeBuilder`s), I realized that this is probably relevant 
_during the evaluation of checker callbacks_ (when 
`CheckerContext::addTransition` calls can and do perform arbitrarily complex 
transition sequences and it is important to continue with the _frontier_ after 
the checker callback).

My current plan is that I would move this `Frontier` management to 
`CheckerContext` -- because it is irrelevant and confusing in the dozens of 
other throwaway `NodeBuilder`s that are never used for checker execution.

> I don't mind and I actually want to encourage anyone simplifying things. One 
> way is to try to get rid of something and if it was easy, then chances are 
> that the "thing" wasn't really needed after all.
I like where this is going.

I hope that I will be able to meaningfully simplify the codebase :smile:

However it is likely that I won't do this immediately. During the next days 
I'll try to make `switch` statements trigger the `BranchCondition` callback (I 
have an unpublished almost-ready commit which should work correctly now that 
`SwitchNodeBuilder`s are `NodeBuilder`s) and then merge 
optin.core.UnconditionalVAArg 
(https://github.com/llvm/llvm-project/pull/175602, which would work correctly 
once `switch` triggers the `BranchCondition` callback). After these are done, 
I'll switch to working on `security.ArrayBound` and generalizing its logic for 
use in other checkers (like e.g. `ReturnPtrRange`), and I will work on this 
`NodeBuilder` removal only after that (or within "breaks" when I'm blocked on 
the bounds checking front).

https://github.com/llvm/llvm-project/pull/181431
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to