On Thu, 2022-06-09 at 16:49 +0200, Tim Lange wrote:
> 
>  > On Wed, Jun 8 2022 at 11:12:52 -0400, David Malcolm <dmalc...@redhat.com> wrote:
>  > > On Wed, 2022-06-08 at 01:42 +0200, Tim Lange wrote:
>  > >
>  > > Hi Dave,

Hi Tim; various responses inline below...

>  > >
>  > > I did spend some time thinking about the zero state machine. I
>  > > first thought about distinguishing between "assigned zero" and
>  > > "EQ 0 condition on the path" for cases where e.g. unreachable()
>  > > is used to say that some variable will never be zero according
>  > > to the programmer. In that case the dev might not want zero
>  > > warnings from conditions outside the if body itself for dev
>  > > builds where unreachable expands to some panic exit. But as the
>  > > condition constraint is not pruned when the state machine
>  > > distinguishes the states, I'm not sure how to know whether the
>  > > analysis has already left the if body?
>  >
>  > The analyzer works on the gimple-ssa representation, which uses
>  > basic blocks in a CFG, rather than an AST, so the only remnants we
>  > have of scope are in "clobber" statements (a special kind of
>  > assignment stmt), which the gimplifier adds as variables go out of
>  > scope.
> If the constraints only lived until the immediate dominator of
> `if (cond)`, I could easily distinguish:
> 1. if (x == 0) && still inside the if => zero
> 2. if (x == 0) && outside the if => maybe zero
> but as this seems not to be the case, if I want to distinguish
> 1. & 2., I'd have to find another way.
>  >
>  > For pruning, the analyzer's state_machine class has a "can_purge_p"
>  > virtual function:
>  >
>  >   /* Return true if it safe to discard the given state (to help
>  >      when simplifying state objects).
>  >      States that need leak detection should return false.  */
>  >   virtual bool can_purge_p (state_t s) const = 0;
>  >
>  > which should return true for a "zeroness" state machine, in that we
>  > always consider pruning states for svalues that aren't needed
>  > anymore along a path.
> Is implemented and returns true.
>  >
>  > Is there some other kind of state explosion you're running into?
>  > It's hard to figure this out further without seeing code.
> No, my code is far from mature enough to be tested. I just had it in
> my head that I wanted to find out whether I can distinguish the two
> cases.
>  >
>  >
>  > > Also, while trying out different things, it seems simple
>  > > assignments on phi like here
>  > >   int x;
>  > >   if (argc == 1) {
>  > >     x = 1; // x_5
>  > >   } else {
>  > >     x = 0; // x_4
>  > >   }
>  > >   printf("%i", 5 / x); // x_2
>  > > automatically work such that x_2 already inherits the state from
>  > > x_4/x_5 without me doing anything inside my sm's on_phi function.
>  > > Same for the simple y = 0; x = y; case. Where does this happen
>  > > inside the code?
>  >
>  > With the caveat that I'm not seeing your code, what's probably
>  > happening is that we have either:
>  >
>  >   BB (a):
>  >     x_5 = 1;
>  >     goto BB (c);
>  >
>  >   BB (b):
>  >     x_4 = 0;
>  >     goto BB (c);
>  >
>  >   BB (c):
>  >     x_2 = PHI (x_5 from (a), x_4 from (b));
> I compiled it with -g, so this one matches the dumped gimple.
>  >
>  > or (at higher optimization levels):
>  >
>  >   BB (a):
>  >     goto BB (c);
>  >
>  >   BB (b):
>  >     goto BB (c);
>  >
>  >   BB (c):
>  >     x_2 = PHI (1 from (a), 0 from (b));
>  >
>  > and I think that at the phi node we have region_model::handle_phi,
>  > which is setting x_2 to either the constant 1 or the constant 0 in
>  > the store, and is calling the on_phi vfunc, leading to on_phi being
>  > called for all state machines.
> Thanks, that is the case. The set_value inside handle_phi seems to do
> this for me.
>  >
>  > BTW, are you implementing an override for this vfunc:
>  >   virtual state_machine::state_t get_default_state (const svalue *)
>  >     const;
>  > to capture the inherently known zeroness/nonzeroness of
>  > constant_svalue instances? That would make those constants have
>  > that state.
> Yeah, I saw that in your nullness check. I tried it on a small example
> with and without, but didn't notice a difference in warnings (except
> for not having zero(x_4) inside the supergraph.dot). So if I
> understood this right, this is just to have one state less for that
> variable/value[0]?

The states are stored in sm_state_map using a mapping from svalue to
state.

Given e.g. a parameter, this will be "initial_svalue (parm)", but in
the above case where x_2 either has value 1 or has value 0, it's not
storing a state for x_2; it's storing a state for 0 or a state for 1.
So without implementing the get_default_state vfunc, the sm_state_maps
will gradually acquire explicit "this is zero" or "this is nonzero"
states for all of the constants that get used, leading to an explosion
of states for all the different combinations of constants that have
been encountered along an execution path, which is probably not going
to be useful.  If you do implement the get_default_state vfunc for
constants, then the constants will implicitly have the zero vs nonzero
states, and this information won't get stored in the sm_state_map
instances.
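
FWIW, such an override can be pretty minimal; roughly something like
this (an untested sketch; "zero_state_machine", "m_zero" and
"m_nonzero" are stand-ins for whatever names your sm actually uses):

  state_machine::state_t
  zero_state_machine::get_default_state (const svalue *sval) const
  {
    /* Constants implicitly carry their zeroness, so no explicit
       entries are needed for them in the sm_state_map.  */
    if (tree cst = sval->maybe_get_constant ())
      {
        if (zerop (cst))
          return m_zero;
        else
          return m_nonzero;
      }
    return m_start;
  }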

program_state::can_merge_with_p compares the sm_state_map instances
between the two program_state instances - and so if the
get_default_state vfunc is implemented for constants, the peer
sm_state_maps for the "zero" state machine will both be empty, and thus
equal, and thus it will merge state.

So it might make sense to revamp this sm_state_map comparison so that
implicit states are also considered when merging the store: then we
would see that "1" and "0" have different sm-states for the "zero"
state machine, and thus shouldn't merge states for these.  See
binding_cluster::can_merge_p, which calls svalue::can_merge_p.  It
might make sense to have svalue::can_merge_p "know" about sm-state-maps
for this.

svalue::can_merge_p calls model_merger::mergeable_svalue_p, so that we
can reject merging values that explicitly have different state-machine
states, but that code could be reworked so that it considers whether a
pair of svalues can be merged, using implicit state-machine states (and
thus have it reject the merger of a zero with a nonzero constant).

Or, whilst prototyping, you could simply hack in a rejection of merging
zero vs non-zero integer values into svalue::can_merge_p.
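
e.g. something roughly like this near the top of svalue::can_merge_p
(an untested, prototype-only sketch; IIRC the parameter there is called
"other"):

  /* Prototyping hack: refuse to merge a zero integer constant with a
     nonzero one, so the zero vs nonzero distinction survives CFG
     merge points.  */
  tree cst0 = maybe_get_constant ();
  tree cst1 = other->maybe_get_constant ();
  if (cst0 && cst1
      && TREE_CODE (cst0) == INTEGER_CST
      && TREE_CODE (cst1) == INTEGER_CST
      && integer_zerop (cst0) != integer_zerop (cst1))
    return NULL;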

> 
> If that is right, is it also favorable to "merge" the stop state and
> non_zero state inside the zero state machine because - for now -
> there is no warning planned on non-zero values?
> 
> [0] fewer states are favorable because then the analyzer maybe has
> fewer different enodes to visit and thus less runtime(?)
>  >
>  > Thanks
>  > Dave
>  >
>  > > - Tim
> 
> Also another question, unrelated to the ones before. I do have a weird
> bug in my zero sm[1] but I'm unsure where my sm is flawed. Take, for
> example, probably the simplest case:
>   int x = 0;
>   printf("%d", 42 / x);
> If I use inform to emit a notice for my state machine result, it seems
> to be correct and I do get the following traversal order (by printing
> the gimple stmt inside on_stmt):
>   x_2 = 0;
>   _1 = 42 / x_2;
>   # .MEM_4 = VDEF <.MEM_3(D)>
>   printf ("%d", _1);
>   _5 = 0;
>   <L0>:
>   # VUSE <.MEM_4>
>   return _5;
> But if I use my zero_diagnostics and sm_ctxt->warn to emit the
> warning, I get an unexpected traversal order of:
>   x_2 = 0;
>   _1 = 42 / x_2;
>   # .MEM_4 = VDEF <.MEM_3(D)>
>   printf ("%d", _1);
>   _5 = 0;
>   <L0>:
>   # VUSE <.MEM_4>
>   return _5;
>   x_2 = 0;     <-- why do I get these stmts again? They are not
>                    duplicated like this in the e-supergraph.
>   _1 = 42 / x_2;
>   _5 = 0;
>   _1 = 42 / x_2;
>   _5 = 0;

What exactly is your test source file?

Note that a function being called by another function will get explored
in the exploded_graph twice: once being analyzed "standalone", and
again being analyzed as called from the other function.

When you say "I get these stmts again", do you mean the state machine
code is seeing them again, or the region model code?

  sm_ctxt->warn (new foo)

will save an instance of a pending_diagnostic into the
diagnostic_manager, associating it with a particular exploded_node (and
some other info).
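
(For context, a full call from on_stmt typically looks roughly like the
following sketch, with "zero_diagnostic" and "m_zero" standing in for
whatever names your sm uses:

  /* In on_stmt: flag integer divisions whose divisor is known to be
     zero along the current path.  */
  if (const gassign *assign = dyn_cast <const gassign *> (stmt))
    if (gimple_assign_rhs_code (assign) == TRUNC_DIV_EXPR)
      {
        tree divisor = gimple_assign_rhs2 (assign);
        if (sm_ctxt->get_state (stmt, divisor) == m_zero)
          sm_ctxt->warn (node, stmt, divisor,
                         new zero_diagnostic (*this, divisor));
      }
)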

After the exploded_graph has been built, the diagnostic_manager does
some deduplication work, and then tries to find a feasible path to each
pending_diagnostic's exploded node.

This feasibility analysis reruns much of the region_model code that was
run when building the exploded graph.  So that's another way you can
"see" stmts repeatedly.


> I tracked the cause down the call stack to
> m_saved_diagnostics.safe_push (sd); inside
> diagnostic_manager::add_diagnostic. For whatever reason, the pushing
> of diagnostics bricks the zero sm. It might be an embarrassing error
> on my side, but I'm stuck on this and can't seem to find what I'm
> doing wrong.

FWIW, assuming you're using the gdb support scripts:
  
https://gcc-newbies-guide.readthedocs.io/en/latest/debugging.html#support-scripts-for-gdb
there's a handy:

(gdb) break-on-saved-diagnostic

command which adds a breakpoint on the analyzer's diagnostic_manager
saving a pending_diagnostic.

You might also want to try -fdump-analyzer-stderr, which will dump lots
of information on what the analyzer is doing to stderr; although
verbose, this is usually very helpful in figuring out why the analyzer
is doing something.
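
e.g. roughly (with whichever driver and flags you're already using,
redirecting stderr to a file since the dump can be large):

  gcc -fanalyzer -fdump-analyzer-stderr test.c 2> analyzer.log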

Hope this is helpful
Dave



> 
> - Tim
> 
> [1] https://github.com/timll/gcc/blob/castsize/gcc/analyzer/sm-zero.cc#L181
> 

