[RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-08-22 Thread Edward Cree
The first patch is a simplification of register liveness tracking by using
 a separate parentage chain for each register and stack slot, thus avoiding
 the need for logic to handle callee-saved registers when applying read
 marks.  In the future this idea may be extended to form use-def chains.
The second patch adds information about misc/zero data on the stack to the
 state dumps emitted to the log at various points; this information was
 found essential in debugging the first patch, and may be useful elsewhere.

Edward Cree (2):
  bpf/verifier: per-register parent pointers
  bpf/verifier: display non-spill stack slot types in
print_verifier_state

 include/linux/bpf_verifier.h |   8 +-
 kernel/bpf/verifier.c| 216 ++-
 2 files changed, 72 insertions(+), 152 deletions(-)



Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-09-26 Thread Jiong Wang

On 22/08/2018 20:00, Edward Cree wrote:

The first patch is a simplification of register liveness tracking by using
 a separate parentage chain for each register and stack slot, thus avoiding
 the need for logic to handle callee-saved registers when applying read
 marks.  In the future this idea may be extended to form use-def chains.


Interesting.

This could potentially help efficient code-gen for 32-bit architectures and I
had been thinking some implementations along this line and would like to have
a discussion.

Program description
===
It will be good if we could know one instruction is safe to operate on low
32-bit sub-register only. If this is true:
  - 32-bit arches could save some code-gen.
  - 64-bit arches could utilize 32-bit sub-register instructions.

Algorithm
===
Based on the current verifier code-path-walker, it looks to me we could get
32-bit safety information using the following algorithm:

  1. assume all instructions operate on 32-bit sub-register initially.
  2. link each insn to insns which define its use.
  3. for a register use, if it is a 64-bit write back to memory, or if it is
 a comparison-then-jump based on 64-bit value, or if it is a right shift,
 then consider the use is a 64-bit use, then all its define insns and their
 parents define insns should be marked as need full 64-bit operation.
  4. currently, register read (REG_LIVE_READ) will be propagated to parent
 state when path prune happened, and REG_LIVE_WRITTEN would screen off such
 propagation. We need to propagate 64-bit mark in similar way, but
 REG_LIVE_WRITTEN shouldn't screen off backward 64-bit mark propagation if
 the written reg also used as source reg (this has raised REG_LIVE_READ
 propagation but it is not necessarily indicating 64-bit read) in the same
 instruction.

Implementation
===
And it seems there could be an implementation based on current liveness tracking
infrastructure without dramatic change.

  1. instruction level use->def chain

 - new active_defs array for each call frame to record which insn the active
   define of one register comes from. callee copies active_defs from caller
   for argument registers only, and copies active_def of R0 to caller when
   exit.

   struct bpf_func_state {
 ...
 s16 active_defs[__MAX_BPF_REG];

 - new use->def chains for each instruction. one eBPF insn could have two
   uses at maximum. also one new boolean "full_ref_def" added to keep the
   final 32-bit safety information. it will be true if this instruction
   needs to operate on full 64-bit.

   bpf_insn_aux_data {
   ...
 struct {
   s16 def[2];
   bool full_ref_def;
 };

  2. Inside mark_reg_read, split SRC_OP to SRC0_OP/SRC1_OP/SRC_OP_64/SRC1_OP_64
 to indicate one register read if from the first or second use slot of this
 instruction, and to indicate whether the read is on full 64-bit which will
 only be true if the read is for 64-bit write back to memory, or 64-bit
 comparison-then-jump, or bit right shift means 64-bit.

 Build use->def chain for any read, but do 64-bit backward propagation
 for 64-bit read only. The propagation is to set full_ref_def to true for
 def and parent defs of the 64-bit read.

   3. Need new bpf_reg_liveness enum, REG_LIVE_READ64 to indicate there is
  64-bit access to one register in the pruned path, and need
  REG_LIVE_WRITTEN64 to indicate a write that is REG_LIVE_WRITTEN but should
  not screen backward propagate 64-bit read info.

   #define REG_LIVE_NONE  0
   #define REG_LIVE_READ  BIT(0)
   #define REG_LIVE_READ64BIT(1)
   #define REG_LIVE_WRITTEN   BIT(2)
   #define REG_LIVE_WRITTEN64 BIT(3)

I might have missed something in above thoughts, would appreciate reviews and
suggestions. I will also send out some concrete patches later.

Regards,
Jiong



Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-09-28 Thread Edward Cree
On 26/09/18 23:16, Jiong Wang wrote:
> On 22/08/2018 20:00, Edward Cree wrote:
>> In the future this idea may be extended to form use-def chains.
>
>   1. instruction level use->def chain
>
>  - new use->def chains for each instruction. one eBPF insn could have two
>    uses at maximum.
I was thinking of something a lot weaker/simpler, just making
    ld rX, rY
 copy rY.parent into rX.parent and not read-mark rY (whereas actual
 arithmetic, pointer deref etc. would still create read marks).
But what you've described sounds interesting; perhaps it would also
 help later with loop-variable handling?

-Ed


Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-10-03 Thread Jiong Wang

On 28/09/2018 14:36, Edward Cree wrote:
> On 26/09/18 23:16, Jiong Wang wrote:
>> On 22/08/2018 20:00, Edward Cree wrote:
>>> In the future this idea may be extended to form use-def chains.
>>
>>   1. instruction level use->def chain
>>
>>  - new use->def chains for each instruction. one eBPF insn could 
have two

>>    uses at maximum.
> I was thinking of something a lot weaker/simpler, just making
> ld rX, rY
>  copy rY.parent into rX.parent and not read-mark rY (whereas actual
>  arithmetic, pointer deref etc. would still create read marks).

Thanks for the feedback Edward.

> But what you've described sounds interesting; perhaps it would also
>  help later with loop-variable handling?

Haven't considered how to use this for loop-variable handling, guess you 
mean
applying what I have described to your previous loop detection RFC? I 
will look

into your RFC later.

At the moment the design of the use->def chain is mainly to optimize 32-bit
code-gen. I was about to satisfied with a local implementation and to 
share it

to ML for further discussion. However, when manually check the optimization
result on testcase with medium size (~1000 eBPF insns) and proper complexity
(make sure path prunes etc are triggered inside verifier), I found the 
code-gen

doesn't meet my expectation.

For example, for the following sequence, insn at 25 should operate on 
full-64

bit but I found it is marked as 32-bit safe.

  25:    r7 = 1
  26:    if r4 > r8 goto +1200 
  27:    r1 = *(u8 *)(r1 + 0)
  28:    r1 &= 15
  29:    r7 = 1
  ...

L:
  1227:  r0 = r7
  1228:  exit

As described at previous email, the algorithm assume all insns are 
32-bit safe

first, then start to insns back to "64-bit" if there is any 64-bit use found
for a insn.

Insn 25 is defining r7 which is used at the 1227 where its value 
propagated to

r0 and then r0 is implicitly used at insn 1228 as it is a exit from main
function to external.

For above example, as we don't know the external use of r0 at 1228 (exit 
from

main to external), so r0 is treated as 64-bit implicit use. The define is at
1227, so insn 1227 is marked as "64-bit". The "64-bit" attribute should
propagate to source register operand through register move and 
arithmetic, so
r7 at insn 1227 is a "64-bit" use and should make its definition 
instruction,

insn 25, marked as "64-bit". This is my thinking of how insn 25 should be
marked.

Now this hasn't happened. I am still debugging the root cause, but kind 
of feel
"64-bit" attribute propagation is the issue, it seems to me it can't be 
nicely
integrated into the existing register read/write propagation 
infrastructure. For
example, for a slightly more complex sequence which is composed of three 
states:


State A
  ...
  10: r6 = *(u32 *)(r10 - 4)
  11: r7 = *(u32 *)(r10 - 8)
  12: *(u64 *)(r10 - 16) = r6
  13: *(u64 *)(r10 - 24) = r7

State B
  14: r6 += 1
  15: r7 += r6
  16: *(u32 *)(r10 - 28) = r7

State C
  ...
  17: r3 += r7
  18: r4 = 1
  19: *(u64 *)(r10 - 32) = r3
  20: *(u64 *)(r10 - 40) = r4

State A is parent of state B which is parent of state C.

Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is
marked as "64-bit". There is no register source at 18, so "64-bit" attribute
propagation is stopped.

Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as
"64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become
"64-bit" now, and their definition should be marked as "64-bit".

Now if the definition of r3 or r7 comes from parent state, then the 
parent state

should receive a "REG_LIVE_READ64", this is necessary if later another path
reaches state C and triggers prune path, for which case that path should 
know

there is "64-bit" use inside state C on some registers, and should use this
information to mark "64-bit" insn.

If the definition of r3 or r7 is still inside state C, we need to keep 
walking
up the instruction sequences, and propagate "64-bit" attribute upward 
until it

goes beyond the state C.

The above propagation logic is quite different from existing register 
read/write
propagation. For the latter, a write just screen up all following read, 
and a
read would propagate directly to its parent is there is not previous 
write, no

instruction analysis is required.

I am just describing what I have run into and trying to resolve, any 
thoughts

and suggestions are appreciated.

Regards,
Jiong


Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-10-03 Thread Alexei Starovoitov
On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote:
> On 28/09/2018 14:36, Edward Cree wrote:
> > On 26/09/18 23:16, Jiong Wang wrote:
> >> On 22/08/2018 20:00, Edward Cree wrote:
> >>> In the future this idea may be extended to form use-def chains.
> >>
> >>   1. instruction level use->def chain
> >>
> >>  - new use->def chains for each instruction. one eBPF insn could have
> two
> >>    uses at maximum.
> > I was thinking of something a lot weaker/simpler, just making
> > ld rX, rY
> >  copy rY.parent into rX.parent and not read-mark rY (whereas actual
> >  arithmetic, pointer deref etc. would still create read marks).
> 
> Thanks for the feedback Edward.
> 
> > But what you've described sounds interesting; perhaps it would also
> >  help later with loop-variable handling?
> 
> Haven't considered how to use this for loop-variable handling, guess you
> mean
> applying what I have described to your previous loop detection RFC? I will
> look
> into your RFC later.
> 
> At the moment the design of the use->def chain is mainly to optimize 32-bit
> code-gen. I was about to satisfied with a local implementation and to share
> it
> to ML for further discussion. However, when manually check the optimization
> result on testcase with medium size (~1000 eBPF insns) and proper complexity
> (make sure path prunes etc are triggered inside verifier), I found the
> code-gen
> doesn't meet my expectation.
> 
> For example, for the following sequence, insn at 25 should operate on
> full-64
> bit but I found it is marked as 32-bit safe.
> 
>   25:    r7 = 1
>   26:    if r4 > r8 goto +1200 
>   27:    r1 = *(u8 *)(r1 + 0)
>   28:    r1 &= 15
>   29:    r7 = 1
>   ...
> 
> L:
>   1227:  r0 = r7
>   1228:  exit
> 
> As described at previous email, the algorithm assume all insns are 32-bit
> safe
> first, then start to insns back to "64-bit" if there is any 64-bit use found
> for a insn.
> 
> Insn 25 is defining r7 which is used at the 1227 where its value propagated
> to
> r0 and then r0 is implicitly used at insn 1228 as it is a exit from main
> function to external.
> 
> For above example, as we don't know the external use of r0 at 1228 (exit
> from
> main to external), so r0 is treated as 64-bit implicit use. The define is at
> 1227, so insn 1227 is marked as "64-bit". The "64-bit" attribute should
> propagate to source register operand through register move and arithmetic,
> so
> r7 at insn 1227 is a "64-bit" use and should make its definition
> instruction,
> insn 25, marked as "64-bit". This is my thinking of how insn 25 should be
> marked.

all makes sense to me.

> Now this hasn't happened. I am still debugging the root cause, but kind of
> feel
> "64-bit" attribute propagation is the issue, it seems to me it can't be
> nicely
> integrated into the existing register read/write propagation infrastructure.

may be share your patches that modify the liveness propagation?

> For
> example, for a slightly more complex sequence which is composed of three
> states:
> 
> State A
>   ...
>   10: r6 = *(u32 *)(r10 - 4)
>   11: r7 = *(u32 *)(r10 - 8)
>   12: *(u64 *)(r10 - 16) = r6
>   13: *(u64 *)(r10 - 24) = r7
> 
> State B
>   14: r6 += 1
>   15: r7 += r6
>   16: *(u32 *)(r10 - 28) = r7
> 
> State C
>   ...
>   17: r3 += r7
>   18: r4 = 1
>   19: *(u64 *)(r10 - 32) = r3
>   20: *(u64 *)(r10 - 40) = r4
> 
> State A is parent of state B which is parent of state C.
> 
> Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is
> marked as "64-bit". There is no register source at 18, so "64-bit" attribute
> propagation is stopped.
> 
> Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as
> "64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become
> "64-bit" now, and their definition should be marked as "64-bit".
> 
> Now if the definition of r3 or r7 comes from parent state, then the parent

... the definition of r3 _and_ r7 ...
both need to propagate up with your algo, right?

> state
> should receive a "REG_LIVE_READ64", this is necessary if later another path
> reaches state C and triggers prune path, for which case that path should
> know
> there is "64-bit" use inside state C on some registers, and should use this
> information to mark "64-bit" insn.
> 
> If the definition of r3 or r7 is still inside state C, we need to keep
> walking
> up the instruction sequences, and propagate "64-bit" attribute upward until
> it
> goes beyond the state C.
> 
> The above propagation logic is quite different from existing register
> read/write
> propagation.
> For the latter, a write just screen up all following read, and
> a
> read would propagate directly to its parent is there is not previous write,
> no
> instruction analysis is required.

correct.
with such algo REG_LIVE_WRITTEN shouldn't be screening the propagation.

I think the patches will discuss the algo.
Also I think the initial state of 'everything is 32-bit safe'
and make marks to enforce 64-bit-ne

Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-10-03 Thread Jiong Wang

On 03/10/2018 16:59, Alexei Starovoitov wrote:

On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote:



Now this hasn't happened. I am still debugging the root cause, but kind of
feel
"64-bit" attribute propagation is the issue, it seems to me it can't be
nicely
integrated into the existing register read/write propagation infrastructure.


may be share your patches that modify the liveness propagation?


OK, I will share it after some clean up.


For
example, for a slightly more complex sequence which is composed of three
states:

State A
  ...
  10: r6 = *(u32 *)(r10 - 4)
  11: r7 = *(u32 *)(r10 - 8)
  12: *(u64 *)(r10 - 16) = r6
  13: *(u64 *)(r10 - 24) = r7

State B
  14: r6 += 1
  15: r7 += r6
  16: *(u32 *)(r10 - 28) = r7

State C
  ...
  17: r3 += r7
  18: r4 = 1
  19: *(u64 *)(r10 - 32) = r3
  20: *(u64 *)(r10 - 40) = r4

State A is parent of state B which is parent of state C.

Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is
marked as "64-bit". There is no register source at 18, so "64-bit" attribute
propagation is stopped.

Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as
"64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become
"64-bit" now, and their definition should be marked as "64-bit".

Now if the definition of r3 or r7 comes from parent state, then the parent


... the definition of r3 _and_ r7 ...
both need to propagate up with your algo, right?


Yes, all sources of insn 17, both r3 and r7.


state
should receive a "REG_LIVE_READ64", this is necessary if later another path
reaches state C and triggers prune path, for which case that path should
know
there is "64-bit" use inside state C on some registers, and should use this
information to mark "64-bit" insn.

If the definition of r3 or r7 is still inside state C, we need to keep
walking
up the instruction sequences, and propagate "64-bit" attribute upward until
it
goes beyond the state C.

The above propagation logic is quite different from existing register
read/write propagation.
For the latter, a write just screen up all following read, and a
read would propagate directly to its parent is there is not previous write,
no instruction analysis is required.


correct.
with such algo REG_LIVE_WRITTEN shouldn't be screening the propagation.

I think the patches will discuss the algo.
Also I think the initial state of 'everything is 32-bit safe'
and make marks to enforce 64-bit-ness is a dangerous algorithmic choice.


Indeed, and I am actually thinking the same thing ...


Why not to start at a safer state where everything is 64-bit
and work backward to find out which ones can be 32-bit?
That will be safer algo in case there are issues with bit like
you described in above.


... but I failed to find a algo works on making initial state of
"everything is 64-bit". I haven't figured out a way to check that all use of one
definition are 32-bit on all possible code paths, which looks to me is a must 
for
one insn marked as 32-bit safe.



Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-10-04 Thread Edward Cree
On 03/10/18 16:36, Jiong Wang wrote:
> On 28/09/2018 14:36, Edward Cree wrote:
> > But what you've described sounds interesting; perhaps it would also
> >  help later with loop-variable handling?
>
> Haven't considered how to use this for loop-variable handling, guess you mean
> applying what I have described to your previous loop detection RFC? I will 
> look
> into your RFC later.

Tbh I was thinking more of John Fastabend's version (I'm not sure if he ever
 got round to posting patches, but he discussed the design towards the end of
 https://www.mail-archive.com/netdev@vger.kernel.org/msg216285.html ) which
 is building 'proper compiler data structures' and thus might be interested
 in proper use-def chains.  (Or it might not; I'm not really a compiler-guru
 so it's not immediately obvious to me.)

My approach was much less interested in the 'provenance' of the induction
 variable, just that it was increasing appropriately, so use-def chains are
 not really relevant to it.

-Ed


Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-10-08 Thread Jiong Wang

On 03/10/2018 17:53, Jiong Wang wrote:

On 03/10/2018 16:59, Alexei Starovoitov wrote:

On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote:



Now this hasn't happened. I am still debugging the root cause, but kind of
feel
"64-bit" attribute propagation is the issue, it seems to me it can't be
nicely
integrated into the existing register read/write propagation infrastructure.


may be share your patches that modify the liveness propagation?


OK, I will share it after some clean up.


Have done more experiments on the algo, and finally got something passed my
local tests. bpf selftest passed as well except several test_verifier failures
due to some corner cases I guess, will have a look later.

In these test, x86-64 is using the 32-bit information. In the insn loop inside
sanitize_dead_code, once an insn is not marked as 64-bit and if it is ALU64, it
will then be changed to ALU32. When enable tracking using printk, I could see
quite a few ALU64 instructions really are rewritten into ALU32, so tests like
test_l4lb runs OK looks like a positive signal of the correctness.

The algo separates the problem into two steps.

  - First, assume there is no code path prune and all code paths will be walked
through. In this case, if we could propagate 64-bit info backward along the
use-def chains for all paths walked, one insn must will be marked as 64-bit
correctly. Finish this step requires building use-def chain, and it is done
in the following way:

 1. each insn could have two explicit uses, so add two chain fields in
bpf_insn_aux_data.
 2. we need finer enum to describe register use, so split SRC_OP into
SRC_OP_0, SRC_OP64_0, SRC_OP_1, SRC_OP64_1 to indicate the source
is the first/second source and whether it is a 64-bit source.
 3. create the chain at check_reg_arg which is exactly covering all
register use sites. The function to create the chain is link_reg_to_def.
 4. when creating the chain, if a source is a 64-bit source, also
propagating the information backward along use-def chain straight away.
This is done in mark_reg_read which further calls the new function
"mark_insn_64bit" which is doing the real job. "mark_insn_64bit" fetches
the def insn for the 64-bit source, and further marks the def insns of
its sources as 64-bit. This will be repeated until the whole use-def
chain consumed.
 5. by use-def chain described above, if there is no code path prune, one
insn must have been marked as 64-bit when it's result has 64-bit use.
 6. helper call causing implicit reg use and must be conservative treated
as 64-bit use, bpf-to-bpf function call has insn connected by use-def
so doesn't need to make that conservative assumption.

  - Second, to handle code path prune, define new liveness enum
REG_LIVE_READ64 and REG_LIVE_UNIQUE_WRITTEN. The latter will only be
set if reg_arg_type is the new U_DST_OP or U_DST_OP_NO_MARK, and
REG_LIVE_READ64 will be set if one 64-bit read is not screened off by
REG_LIVE_UNIQUE_WRITTEN.

The thought is 64-bit use info will only be screened off if the dst register
is unique in all register operands, meaning not the same as any source. For
example, insn 18 below will screen off r4 64-bit propagation.

  17: r3 += r7
  18: r4 = 1
  19: *(u64 *)(r10 - 32) = r3
  20: *(u64 *)(r10 - 40) = r4

So, U_DST_OP/U_DST_OP_NO_MARK have been introduced to differentiate with
DST_OP/DST_OP_NO_MARK. Inside check_reg_arg, checks are done on dst_reg,
and would pass U_DST_* as reg_arg_type when it is unique. U_DST_* then
will set liveness to REG_LIVE_UNIQUE_WRITTEN. In side mark_reg_read, if
one 64-bit read is not screened off by REG_LIVE_UNIQUE_WRITTEN, then
REG_LIVE_READ64 will be set in the reg_state. REG_LIVE_READ64 further
triggers propagating downstream 64-bit uses from the pruned paths into the
current path inside propagate_liveness when path prune happened.

Compared with propagating REG_LIVE_READ, propagating REG_LIVE_READ64 needs
more work. Because one 64-bit read could indicating more than one registers
are 64-bit. For example, insn 19 above shows r3 is 64-bit source, so its
define at insn 17 are 64-bit, then all sources of insn 17 must be 64-bit,
meaning both r3 and r7 are 64-bit. Therefore, REG_LIVE_READ64 needs to be
propagated on both r3 and r7 upward along the register parentage chain.
During walking def-use chain, we record all such affected reg_state, and
propagate REG_LIVE_READ64 for all of them. This logic is done inside
mark_insn_64bit as well.

  - For short, the implementation treating the new 64-bit info (REG_LIVE_READ64)
propagation the same as REG_LIVE_READ propagation.

REG_LIVE_READ triggers more path prune during state equal comparison, while
REG_LIVE_READ64 triggers 64-bit insn marking during def-use c

Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-08-29 Thread Alexei Starovoitov
On Wed, Aug 22, 2018 at 08:00:46PM +0100, Edward Cree wrote:
> The first patch is a simplification of register liveness tracking by using
>  a separate parentage chain for each register and stack slot, thus avoiding
>  the need for logic to handle callee-saved registers when applying read
>  marks.  In the future this idea may be extended to form use-def chains.
> The second patch adds information about misc/zero data on the stack to the
>  state dumps emitted to the log at various points; this information was
>  found essential in debugging the first patch, and may be useful elsewhere.

I think this set is a great improvement in liveness tracking,
so depsite seeing the issues I applied it to bpf-next.

I think it's a better base to continue debugging.
In particular:
1. we have instability issue in the verifier.
 from time to time the verifier goes to process extra 7 instructions on one
 of the cilium tests. This was happening before and after this set.
2. there is a nice improvement in number of processed insns with this set,
 but the difference I cannot explain, hence it has to debugged.
 In theory the liveness rewrite shouldn't cause the difference in processed 
insns.

If not for the issue 1 I would argue that the issue 2 means that the set has to
be debugged before going in, but since the verifier is unstable it's better
to debug from this base with this patch set applied (because it greatly
simplifies liveness and adds additional debug in patch2)
and once we figure out the issue 1, I hope, the issue 2 will be resolved 
automatically.

The numbers on cilium bpf programs:
  before1before2   after1  after2
bpf_lb-DLB_L3.o 2003  2003  20032003
bpf_lb-DLB_L4.o 3173  3173  31733173
bpf_lb-DUNKNOWN.o   1080  1080  10801080
bpf_lxc-DDROP_ALL.o 29587 29587 29587   29587
bpf_lxc-DUNKNOWN.o  37204 37211 36926   36933
bpf_netdev.o11283 11283 11188   11188
bpf_overlay.o   6679  6679  66796679
bpf_lcx_jit.o   39657 39657 39561   39561

notice how bpf_lxc-DUNKNOWN.o fluctuates with +7 before and after. That is 
issue 1.
bpf_lxc-DUNKNOWN.o, bpf_netdev.o, and bpf_lcx_jit.o have small improvements.
That is issue 2.

To reproduce above numbers clone this repo: 
https://github.com/4ast/bpf_cilium_test
and run .sh. The .o files in there are pretty old cilium bpf programs.
I kept them frozen and didn't recompile for more than a year to keep stable
base line and track the progress of the verifier in 'processed insns'.

Thanks



Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-08-31 Thread Edward Cree
On 30/08/18 03:18, Alexei Starovoitov wrote:
> I think it's a better base to continue debugging.
> In particular:
> 1. we have instability issue in the verifier.
>  from time to time the verifier goes to process extra 7 instructions on one
>  of the cilium tests. This was happening before and after this set.
I can't reproduce this; I always get 36926.  Can you try recording the
 verifier log and diff the output between the two cases?
> 2. there is a nice improvement in number of processed insns with this set,
>  but the difference I cannot explain, hence it has to debugged.
>  In theory the liveness rewrite shouldn't cause the difference in processed 
> insns.
I shall attack this with the same methods I used for the other delta.
 Since that one turned out to be a real bug in the patch, I'm not so
 sanguine as to dismiss this one as probably connected to issue 1.

-Ed