[clang] [analyzer] Enforce not making overly complicated symbols (PR #144327)

Donát Nagy via cfe-commits Mon, 16 Jun 2025 06:45:41 -0700

https://github.com/NagyDonat commented:


Unfortunately I'm not convinced that this is the right direction for improving 
the analyzer runtime.

On the "risks" side I think that adding the corner case that "this may also 
return `UnknownVal` in rare situations" to many functions complicates the 
logic and burdens the code with early-return branches, and I fear that it will 
act as a footgun.

On the "benefits" side I fear that your statistics don't prove enough:
1. You found that _"Out of the worst 500 entry points, 45 were improved by at 
least 10%. Out of these 45, 5 were improved by more than 50%. Out of these 45, 
2 were improved by more than 80%."_ but this only covers 9% of the worst 500 
entry points. Eyeballing the graph suggests that there are some cases where the 
runtime actually got _worse_ -- so please check that the overall effect of the 
change is also positive (e.g. the total runtime is reduced meaningfully).
2. Moreover, if "worst 500 entry points" means "worst 500 in the first run", 
then it is _a biased sample_. If you pick the worst outliers (i.e. the entry 
points where the sum _expected runtime + luck factor_ is largest), then you 
should expect to get entry points with worse than average luck: among two 
similar entry points, the one with bad luck ends up in the worst 500 while the 
one with good luck avoids it. If you then redo the measurement, [regression 
toward the mean](https://en.wikipedia.org/wiki/Regression_toward_the_mean) will 
produce better results, even if you do both measurements with the same setup! 
As a sanity check, please redo the statistics on the entry points that 
produced the worst 500 runtimes _in the second run_ -- I fear that on that 
sample (which is biased in the opposite direction) you will see that the new 
revision is _worse_ than the baseline.
3. I'm also interested in comparing the statistical results with a second 
independent measurement -- is the set of "worst 500 entry points" stable 
between runs, or are these random unlucky functions that are hit with 
environmental issues?
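
To illustrate points 2 and 3, here is a minimal simulation sketch of the selection effect. Everything in it is an assumption made for illustration: the function name `simulate`, the log-normal runtime distribution, the multiplicative "luck factor" and all the constants are hypothetical and are not taken from the PR's measurements. Both simulated runs use the *same* setup, so any difference between them is pure noise.

```python
import math
import random


def simulate(n_entry_points=5000, top_k=500, luck_sigma=0.3, seed=0):
    """Toy model: measured runtime = expected runtime * random luck factor."""
    rng = random.Random(seed)

    # Hypothetical "true" runtime of each entry point (arbitrary units).
    expected = [rng.lognormvariate(0.0, 1.0) for _ in range(n_entry_points)]

    def measure():
        # One measurement run: every entry point gets an independent luck factor.
        return [e * math.exp(rng.gauss(0.0, luck_sigma)) for e in expected]

    run1 = measure()
    run2 = measure()  # same setup as run1, only the luck differs

    def worst(run):
        # Indices of the top_k slowest entry points in the given run.
        order = sorted(range(n_entry_points), key=run.__getitem__, reverse=True)
        return set(order[:top_k])

    def geo_mean_ratio(sample):
        # Geometric mean of run2/run1 over the sample; < 1 looks like a speedup.
        logs = [math.log(run2[i] / run1[i]) for i in sample]
        return math.exp(sum(logs) / len(logs))

    worst1, worst2 = worst(run1), worst(run2)
    return {
        "ratio_on_worst_of_run1": geo_mean_ratio(worst1),
        "ratio_on_worst_of_run2": geo_mean_ratio(worst2),
        "ratio_on_all": geo_mean_ratio(range(n_entry_points)),
        "worst_set_overlap": len(worst1 & worst2) / top_k,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this toy model the sample selected from the first run shows an apparent speedup in the second run, the sample selected from the second run shows an apparent slowdown, and the ratio over all entry points stays close to 1. The `worst_set_overlap` value corresponds to the stability question in point 3. Whether the real measurements behave like this depends on how large the run-to-run noise actually is, which only the raw data can show.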

If you can share the raw data, I can help with statistical calculations.

https://github.com/llvm/llvm-project/pull/144327
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
