Re: [PATCH net-next] bpf/verifier: improve disassembly of BPF_END instructions

Alexei Starovoitov Fri, 22 Sep 2017 08:16:38 -0700

On Fri, Sep 22, 2017 at 07:27:29AM -0700, Y Song wrote:
> On Fri, Sep 22, 2017 at 7:11 AM, Y Song <[email protected]> wrote:
> > On Fri, Sep 22, 2017 at 6:46 AM, Edward Cree <[email protected]> wrote:
> >> On 22/09/17 00:11, Y Song wrote:
> >>> On Thu, Sep 21, 2017 at 12:58 PM, Edward Cree <[email protected]> wrote:
> >>>> On 21/09/17 20:44, Alexei Starovoitov wrote:
> >>>>> On Thu, Sep 21, 2017 at 09:29:33PM +0200, Daniel Borkmann wrote:
> >>>>>> More intuitive, but agree on the from_be/le. Maybe we should
> >>>>>> just drop the "to_" prefix altogether, and leave the rest as is since
> >>>>>> it's not surrounded by braces, it's also not a cast but rather an op.
> >>>> That works for me.
> >>>>> 'be16 r4' is ambiguous regarding upper bits.
> >>>>>
> >>>>> what about my earlier suggestion:
> >>>>> r4 = (be16) (u16) r4
> >>>>> r4 = (le64) (u64) r4
> >>>>>
> >>>>> It will be pretty clear what instruction is doing (that upper bits 
> >>>>> become zero).
> >>>> Trouble with that is that's very *not* what C will do with those casts
> >>>>  and it doesn't really capture the bidirectional/symmetry thing.  The
> >>>>  closest I could see with that is something like `r4 = (be16/u16) r4`,
> >>>>  but that's quite an ugly mongrel.
> >>>> I think Daniel's idea of `be16`, `le32` etc one-arg opcodes is the
> >>>>  cleanest and clearest.  Should it be
> >>>>     r4 = be16 r4
> >>>>  or just
> >>>>     be16 r4
> >>>> ?  Personally I incline towards the latter, but admit it doesn't really
> >>>>  match the syntax of other opcodes.
> >>> I did some quick prototyping in llvm to make sure we have a syntax
> >>> llvm is happy. Apparently, llvm does not like the syntax
> >>>    r4 = be16 r4    or    r4 = (be16) (u16) r4.
> >>>
> >>> In llvm:utils/TableGen/AsmMatcherEmitter.cpp:
> >>>
> >>>     // Verify that any operand is only mentioned once.
> >> Wait, how do you deal with (totally legal) r4 += r4?
> >> Or r4 = *(r4 +0)?
> >> Even jumps can have src_reg == dst_reg, though it doesn't seem useful.
> >
> > We are talking about dag node here. The above "r4", although using the same
> > register, will be different dag nodes. So it will be okay.
> >
> > The "r4 = be16 r4" tries to use the *same* dag node as both source and
> > destination in the asm output which is prohibited.
> 
> With second thought, we may allow "r4 = be16 r4" by using different dag nodes.
> (I need to do experiment for this.) But we do have constraints that the
> two "r4" must be the same register.  "r5 = be16 r4"  is not allowed. So
> from that perspective, referencing "r4" only once is a good idea and
> less confusing.


looks like we're converging on
"be16/be32/be64/le16/le32/le64 #register" for BPF_END.
I guess I can live with that. I would prefer more C like syntax
to match the rest, but llvm parsing point is a strong one.
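
To make the "upper bits become zero" point concrete, here is a rough
userspace sketch of what "be16 r4" / "le64 r4" would mean for a 64-bit
register. The names bpf_end(), swap_low() and host_is_little_endian() are
made up for illustration and are not kernel or llvm functions; the sketch
assumes the semantics discussed above (convert the low 16/32/64 bits, zero
everything above them).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of the low `bits` bits of v.
 * Only bits/8 bytes are consumed, so the upper bits of the result are zero. */
uint64_t swap_low(uint64_t v, int bits)
{
	uint64_t out = 0;

	for (int i = 0; i < bits / 8; i++) {
		out = (out << 8) | (v & 0xff);
		v >>= 8;
	}
	return out;
}

/* Hypothetical helper: runtime probe of the host byte order. */
int host_is_little_endian(void)
{
	const uint16_t probe = 1;

	return *(const uint8_t *)&probe == 1;
}

/* Sketch of BPF_END on a 64-bit register value.
 * to_be != 0 models "beNN rX", to_be == 0 models "leNN rX". */
uint64_t bpf_end(uint64_t dst, int bits, int to_be)
{
	assert(bits == 16 || bits == 32 || bits == 64);

	uint64_t mask = bits == 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
	/* Swap only when the requested byte order differs from the host's. */
	int need_swap = to_be ? host_is_little_endian()
			      : !host_is_little_endian();

	return need_swap ? swap_low(dst, bits) : (dst & mask);
}
```

Either way the result is truncated to the operand width, which is the part
a bare "be16 r4" does not spell out. Applying the same conversion twice
gives back the original low bits, which is the symmetry Edward mentioned.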

For BPF_NEG I prefer to do it in C syntax like the interpreter does:
        ALU_NEG:
                DST = (u32) -DST;
        ALU64_NEG:
                DST = -DST;
Yonghong, does it mean that asmparser will equally suffer?
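
For reference, the difference between the two NEG forms can be sketched in
plain C as below. This is only an illustration of the two interpreter
statements quoted above, with DST modelled as a uint64_t; alu_neg(),
alu64_neg() and neg_examples() are hypothetical names, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit negate: DST = (u32) -DST; the result is zero-extended to 64 bits. */
uint64_t alu_neg(uint64_t dst)
{
	return (uint32_t)-dst;
}

/* 64-bit negate: DST = -DST; unsigned wraparound over all 64 bits. */
uint64_t alu64_neg(uint64_t dst)
{
	return -dst;
}

/* A few spot checks showing that the upper 32 bits differ between the two. */
void neg_examples(void)
{
	assert(alu_neg(1) == UINT64_C(0xffffffff));
	assert(alu64_neg(1) == UINT64_MAX);
	assert(alu_neg(0) == 0 && alu64_neg(0) == 0);
}
```

So "r4 = (u32) -r4" and "r4 = -r4" would say exactly what each opcode does,
including what happens to the upper half of the register.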
