delcypher added a comment.

In https://reviews.llvm.org/D28952#655337, @dcoughlin wrote:

> In https://reviews.llvm.org/D28952#655278, @ddcc wrote:
>
> > > - That said, I think there are still significant optimization 
> > > opportunities. It looks like when performing a query you create a new 
> > > solver for each set of constraints. My understanding (and perhaps @nlopes 
> > > can correct me here) is that these days Z3 is quite good at incremental 
> > > solves so you may be leaving a lot of performance on the table. For 
> > > example, in `getSymVal()` you create one solver to query for a satisfying 
> > > assignment and then later create a completely new one to determine 
> > > whether it is the only satisfying assignment. Since those two queries 
> > > share all of their conjuncts but one it might be faster to reuse the 
> > > first solver and add the new assertion for the second. Similarly, since 
> > > analyzer exploration is a DFS, you could imagine building up the path 
> > > condition incrementally. Z3 has solver APIs for pushing and popping 
> > > assertions. (Again, @nlopes may have a better idea of whether this would 
> > > pay off in practice.)
> >
> > I agree that the performance is the main problem, and that there are still 
> > a lot of performance optimizations available. I've separated the Z3 solver 
> > and model into separate classes, and reuse the solver now in `getSymVal()` 
> > and `checkNull()`. I looked at using the push/pop approach briefly when I 
> > started writing the solver, but since the static analyzer itself seems to 
> > have an internal worklist, I wasn't sure if the state traversal order is 
> > deterministic and predictable, so I didn't try to reuse a single solver 
> > across all program states. This is the new timing:
> >
> > Z3ConstraintManager (built-in, previous changes and solver state reuse): 
> > 202.37 sec
>
>
> The improvement here (200s vs. 250s) makes me think that some form of 
> caching/incrementalization could pay off.
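
As a point of reference, the solver-reuse / push-pop pattern being discussed
looks roughly like this with the Z3 C++ API (z3++.h). This is only a toy
sketch of the getSymVal() shape described above, with placeholder constraints
rather than anything the analyzer would actually emit:

  // compile with e.g.: c++ -std=c++11 example.cpp -lz3
  #include <z3++.h>
  #include <iostream>

  int main() {
    z3::context Ctx;
    z3::solver Solver(Ctx);
    z3::expr X = Ctx.int_const("x");

    // Constraints accumulated along the current path (toy placeholders).
    Solver.add(X > 0);
    Solver.add(X < 2);

    if (Solver.check() == z3::sat) {
      // First query: find some satisfying assignment for X.
      z3::expr Val = Solver.get_model().eval(X, /*model_completion=*/true);

      // Second query: is that assignment unique? Reuse the same solver and
      // add one extra assertion under a push/pop scope instead of rebuilding
      // every conjunct in a fresh solver.
      Solver.push();
      Solver.add(X != Val);
      bool Unique = (Solver.check() == z3::unsat);
      Solver.pop();

      std::cout << "x = " << Val << (Unique ? " (unique)" : "") << "\n";
    }
    return 0;
  }

The same push()/pop() scoping is what would let a single solver follow the
analyzer's DFS, if the traversal order turns out to be predictable enough.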


KLEE has a few optimizations that you could consider implementing; they
certainly help in the context of symbolic execution.

- Constraint independence. This splits a set of constraints into independent
subsets and solves them separately. This can be useful for further processing
(e.g. it increases the likelihood of a cache hit). A rough sketch of the
splitting step follows this list.
- "CounterExampleCache" . The naming is rather unfortunate and confusing 
(because KLEE works in terms of validity rather than satisfiability). The idea 
is to cache models for solver queries that are SAT (i.e. maintain a mapping 
from models to satisfying assignments with a sentinel to indicate when a query 
does not have a satisfying assignment) and when handling future queries
  - If the new query is already in the cache: if it has a model it is SAT,
    otherwise it is UNSAT.
  - If the new query is a superset of an existing query in the cache: if there
    was no model for the cached query then the superset query is also UNSAT.
    If there is a model for the cached query, try substituting it into the new
    query (along with deterministic values for any symbolic values in the new
    query that aren't in the model) to quickly check whether the cached model
    also satisfies the superset. If this fails, call the real solver.
  - If the new query is a subset of an existing query in the cache: if there
    is a model for the cached query then that model will also satisfy the new
    query. If there is no model for the cached query that tells us nothing
    about the new query (dropping constraints can make it satisfiable), so the
    real solver has to be called.
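
To make the constraint independence point concrete, here is a rough sketch of
the splitting step. The Constraint type is hypothetical (not KLEE's actual
classes; IIRC KLEE does something along these lines in its IndependentSolver).
Constraints that transitively share a symbol have to go to the solver
together; everything else can be solved, and cached, separately:

  #include <map>
  #include <set>
  #include <string>
  #include <vector>

  // Hypothetical representation: each constraint just records the symbols it
  // mentions (assumed non-empty here for brevity).
  struct Constraint {
    std::string Text;
    std::set<std::string> Symbols;
  };

  // Union-find over symbol names with path compression.
  struct UnionFind {
    std::map<std::string, std::string> Parent;
    std::string find(std::string S) {
      if (!Parent.count(S))
        Parent[S] = S;
      if (Parent[S] != S)
        Parent[S] = find(Parent[S]);
      return Parent[S];
    }
    void unite(const std::string &A, const std::string &B) {
      Parent[find(A)] = find(B);
    }
  };

  // Group constraints into independent sets: two constraints end up in the
  // same group iff they are connected through shared symbols.
  std::vector<std::vector<Constraint>>
  splitIndependent(const std::vector<Constraint> &Cs) {
    UnionFind UF;
    for (const Constraint &C : Cs)
      for (const std::string &S : C.Symbols)
        UF.unite(*C.Symbols.begin(), S);

    std::map<std::string, std::vector<Constraint>> Groups;
    for (const Constraint &C : Cs)
      Groups[UF.find(*C.Symbols.begin())].push_back(C);

    std::vector<std::vector<Constraint>> Result;
    for (auto &KV : Groups)
      Result.push_back(std::move(KV.second));
    return Result;
  }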

KLEE also does very aggressive constant folding and expression canonicalization
to try to get more cache hits and to keep the constraints small.
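
To tie the last two points together, here is a rough sketch of the
CounterExampleCache lookups described above, again with hypothetical types. A
query is just a set of canonicalized constraint strings, which is exactly why
the canonicalization matters: the equality/subset/superset tests on the cache
keys only make sense if equivalent constraints have an identical
representation. The substitution check is stubbed out, and a real
implementation would index the cache rather than scan it linearly; failing the
stub or returning Unknown just means falling back to the real solver, which is
always safe:

  // C++17 (std::optional, structured bindings)
  #include <algorithm>
  #include <map>
  #include <optional>
  #include <set>
  #include <string>

  using ConstraintSet = std::set<std::string>; // canonicalized constraints
  using Model = std::map<std::string, int>;    // symbol -> value
  // SAT queries map to a model; std::nullopt is the "no model" sentinel.
  using CounterExampleCache =
      std::map<ConstraintSet, std::optional<Model>>;

  enum class Answer { Sat, Unsat, Unknown /* ask the real solver */ };

  // Placeholder: a real implementation would evaluate Q under M, using
  // deterministic defaults for symbols M doesn't mention. Returning false is
  // always safe -- it only means we miss the shortcut and call the solver.
  static bool modelSatisfies(const Model &M, const ConstraintSet &Q) {
    (void)M;
    (void)Q;
    return false;
  }

  Answer lookup(const CounterExampleCache &Cache, const ConstraintSet &Q,
                Model &ModelOut) {
    for (const auto &[Cached, CachedModel] : Cache) {
      bool QSupersetOfCached =
          std::includes(Q.begin(), Q.end(), Cached.begin(), Cached.end());
      bool QSubsetOfCached =
          std::includes(Cached.begin(), Cached.end(), Q.begin(), Q.end());

      if (QSupersetOfCached && !CachedModel)
        return Answer::Unsat; // UNSAT stays UNSAT when constraints are added
      if (QSupersetOfCached && CachedModel &&
          modelSatisfies(*CachedModel, Q)) {
        ModelOut = *CachedModel; // cheap substitution check paid off
        return Answer::Sat;
      }
      if (QSubsetOfCached && CachedModel) {
        ModelOut = *CachedModel; // a model of a superset also covers Q
        return Answer::Sat;
      }
      // Q being a subset of an UNSAT cached query proves nothing; keep going.
    }
    return Answer::Unknown; // hand the query to the real solver
  }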


https://reviews.llvm.org/D28952


