balazs-benics-sonarsource wrote:

> Unfortunately I'm not convinced that this is the right direction for improving the analyzer runtime.
>
> On the "risks" side I think that adding the corner case that "this may also return `UnknownVal` in rare situations" into many functions complicates the logic, burdens the code with early return branches, and I fear that it will act as a footgun.
>
> On the "benefits" side I fear that your statistics don't prove enough:
>
> 1. You found that _"Out of the worst 500 entry points, 45 were improved by at least 10%. Out of these 45, 5 were improved by more than 50%. Out of these 45, 2 were improved by more than 80%."_ but this only covers 9% of the worst 500 entry points. Eyeballing the graph suggests that there are some cases where the runtime actually got _worse_ -- so please check that the overall effect of the change is also positive (e.g. the total runtime is reduced meaningfully).
> 2. Moreover, if "worst 500 entry points" means "worst 500 in the first run", then it is _a biased sample_: if you pick the worst outliers (i.e. the entry points where the sum _expected runtime + luck factor_ is largest), then you are expected to get entry points with worse than average luck (because among two similar entry points, the one with bad luck ends up in the worst 500 while the one with good luck avoids it), so if you redo the measurement, then [regression toward the mean](https://en.wikipedia.org/wiki/Regression_toward_the_mean) will produce better results -- even if you do both measurements with the same setup! As a sanity check, please redo the statistics on the entry points that produced the worst 500 runtimes _in the second run_ -- I fear that on that sample (which is biased in the opposite direction) you will see that the new revision is _worse_ than the baseline.
> 3. I'm also interested in comparing the statistical results with a second independent measurement -- is the set of "worst 500 entry points" stable between runs, or are these random unlucky functions that are hit with environmental issues?
>
> If you can share the raw data, I can help with statistical calculations.
Could you confirm whether I've read your concerns correctly? As I understand it, you have mainly two concerns:

1) The use of this strong type makes the existing APIs tedious to use, because one needs to unwrap the value and frequently add an early return to explicitly handle the case when symbol creation failed.

2) The measurements weren't conclusive. No evidence was provided that the runtime would not regress in the usual case (across all entry points), and it wasn't fair to evaluate the effectiveness of limiting the maximum symbol complexity (in other words, of honoring the max symbol complexity limit) by looking only at the longest 500 entry points.

> [...] [regression toward the mean](https://en.wikipedia.org/wiki/Regression_toward_the_mean) [...]

I formulated the test case in the tests by inspecting a long-running case; it is consistently low-performing. You can also check on godbolt that it would not finish, because the symbol simplifier would need to do so much work walking the overly complicated symbols. A couple of months ago I also inspected about 10 other radically improved cases, and all of them showed a large number of binary-operation manipulations, like hashing, or like the test case I supply in this patch. This doesn't seem to be a coincidence to me.

From my experience, our pool is large enough to be consistent and roughly reproducible for long-ish entry points. According to our data, an entry point should usually finish in about 1 second, if not less. Anything above that suggests something worth looking at.

To me, encountering symbols with complexity over the dedicated max symbol complexity threshold is a bug. Can you think of other ways to ensure we never create overly complicated symbols?

https://github.com/llvm/llvm-project/pull/144327
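To make the last point more concrete, here is a minimal, self-contained sketch of the invariant I have in mind. It is deliberately not the actual analyzer code: `Symbol`, `makeBinarySymbol`, and the cap value of 35 are made-up stand-ins. It only models a factory that refuses to build a symbol once its complexity would exceed the cap, together with a caller that has to bail out early (the pattern the review is worried about) instead of letting an enormous symbol tree reach the simplifier later.

```cpp
#include <cstdio>
#include <memory>
#include <optional>

// Illustrative cap; the real analyzer option and its default may differ.
constexpr unsigned MaxSymbolComplexity = 35;

// Toy stand-in for a symbolic expression tree; leaves have complexity 1.
struct Symbol {
  unsigned Complexity = 1;
  std::shared_ptr<Symbol> LHS, RHS;
};

// Refuses to materialize an overly complicated symbol, returning nullopt so
// the caller must treat the value as unknown (the early-return burden).
std::optional<Symbol> makeBinarySymbol(const Symbol &LHS, const Symbol &RHS) {
  unsigned Combined = LHS.Complexity + RHS.Complexity + 1;
  if (Combined > MaxSymbolComplexity)
    return std::nullopt;
  Symbol Result;
  Result.Complexity = Combined;
  Result.LHS = std::make_shared<Symbol>(LHS);
  Result.RHS = std::make_shared<Symbol>(RHS);
  return Result;
}

int main() {
  // Hash-like loops are where huge symbols typically appear: each iteration
  // folds the accumulator into a new, strictly larger symbolic expression.
  Symbol Acc; // complexity 1
  for (int I = 0; I < 64; ++I) {
    Symbol Operand; // fresh leaf, complexity 1
    if (std::optional<Symbol> Next = makeBinarySymbol(Acc, Operand)) {
      Acc = *Next;
    } else {
      // Bail out: give up precision instead of building a symbol whose
      // complexity exceeds the cap.
      std::puts("complexity cap hit; treating the value as unknown");
      break;
    }
  }
  std::printf("final symbol complexity: %u\n", Acc.Complexity);
  return 0;
}
```

Running it, the accumulator's complexity grows by a constant amount per iteration, so the cap is hit after a handful of iterations -- the same shape I saw in the radically improved entry points with many binary-operation manipulations.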