On 03/04/18 02:08, Alexei Starovoitov wrote:
> I like patch 3 and going to play with it.
> How did you test it ?
Just test_verifier and test_progs (the latter has a failure
 "test_bpf_obj_id:FAIL:get-prog-info(fd) err 0 errno 2 i 0 type 1(1) info_len 
80(40) jit_enabled 0 jited_prog_len 0 xlated_prog_len 72"
 but that was already there before my patch).

> Do you have processed_insn numbers for
> cilium or selftests programs before and after?
> There should be no difference, right?
That's a good check, I'll do that.

> As far as patch 1 it was very difficult to review, since several logically
> different things clamped together. So breaking it apart:
> - converting two arrays of subprog_starts and subprog_stack_depth into
>   single array of struct bpf_subprog_info is a good thing
> - tsort is interesting, but not sure it's correct. more below
> - but main change of combining subprog logic with do_check is no good.
<snip>
> There will be no 'do_check() across whole program' walk.
> Combining subprog pass with do_check is going into opposite direction
> of this long term work. Divide and conquer. Combining more things into
> do_check is the opposite of this programming principle.
The main objective of my change here was to alter the data structures: to
 store a subprogno in each insn aux rather than using bsearch() on the
 subprog_starts.  I have now figured out how to do this in its own pass
 (previously I thought it needed a recursive walk, which is why I wanted to
 roll it into do_check(); doing more than one whole-program recursive walk
 seems like a bad idea).

> My short term plan is to split basic instruction correctness checks
> out of do_check() loop into separate pass and run it early on.
I agree with that short-term plan; sounds like a good idea.
I'm still not sure I understand the long-term plan, though; since most
 insns' input registers will still need to be checked (I'm assuming the
 bulk of most real eBPF programs consists of computing and dereferencing
 pointers), the data-flow analysis will have to end up doing all the same
 register updates the current do_check() does (though potentially in a
 different order); e.g. if a function is called three times it will have
 to be analysed with three sets of input registers.
Unless you have some way of specifying function preconditions, I don't
 see how that works.  In particular something like
    char *f(char *p)
    {
        *p++ = 0;
        return p;
    }
    int main(void)
    {
        char *p = "abc"; /* represents getting a ptr from ctx or map */
        p = f(p);
        p = f(p);
        p = f(p);
        return 0;
    }
 seems as though it would be difficult to analyse in any way more
 scalable than the current full recursive walk.  Unless you somehow
 propagate register state _backwards_ as constraints when analysing a
 function...?  In any case it seems likely that there are things the
 current verifier accepts which require 'whole-program' analysis to
 determine the dataflow (e.g. what if there were some conditional
 branches in f(), and the preconditions on p depended on the value of
 some other arg in r2?).
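For instance, a variant of f() along these lines (again purely
 illustrative; the extra arg n stands in for r2):

    char *f(char *p, int n)
    {
        if (n > 0)
            *p++ = 0;   /* p only needs to be writable when n > 0 */
        return p;
    }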

> As far as tsort approach for determining max stack depth...
> It's an interesting idea, but implementation is suffering from the same
> 'combine everything into one loop' coding issue.
> I think updating total_stack_depth math should be separate from sorting,
> since the same function can be part of different stacks with different max.
> I don't see how updating global subprog_info[i].total_stack_depth can
> be correct. It has to be different for every stack and the longest
> stack is not necessarily the deepest. May be I'm missing something,
> but combined sort and stack_depth math didn't make it easy to review.
> I find existing check_max_stack_depth() algo much easier to understand.
The sort _is_ the way to compute total stack depth.  The sort order isn't
 being stored anywhere; it's being done just so that each subprog gets
 looked at after all its callers have been considered.  So when it gets
 selected as a 'frontier node', its maximum stack depth is known, and can
 thus be used to update its callees (note that we do a max_t() with each
 callee's existing total_stack_depth, thus getting the deepest stack of
 all call chains to the function).
It may help to imagine drawing the call graph and labelling each node with
 a stack depth as it is visited; sadly that's difficult to show in an email
 (or a code comment).  But I can try to explain it a bit better than
 "/* Update callee total stack depth */".

I will also try to split up patch #1 into more pieces.  I mistakenly thought
 that the existing check_max_stack_depth() depended on some invariants I was
 removing, but I guess that was only true while I had non-contiguous subprogs.

-Ed
