yonghong-song added inline comments.
================ Comment at: llvm/lib/Target/BPF/BPFInstrInfo.td:379 + "$dst = (s8)$src", + [(set GPR:$dst, (sra (shl GPR:$src, (i64 56)), (i64 56)))]>; + def MOVSX_rr_16 : ALU_RR<BPF_ALU64, BPF_MOV, 16, ---------------- eddyz87 wrote: > I think it is possible to avoid matching expansion pattern `(sra (shl > GPR:$src, (i64 56))` here, and instead turn off the expansion when `movsx` is > available. > > I tried the change below and all BPF codegen tests are passing. Do I miss > something? > > --- > > ``` > diff --git a/llvm/lib/Target/BPF/BPFISelLowering.cpp > b/llvm/lib/Target/BPF/BPFISelLowering.cpp > index 9a7357d6ad04..5e84af009591 100644 > --- a/llvm/lib/Target/BPF/BPFISelLowering.cpp > +++ b/llvm/lib/Target/BPF/BPFISelLowering.cpp > @@ -132,9 +132,11 @@ BPFTargetLowering::BPFTargetLowering(const TargetMachine > &TM, > setOperationAction(ISD::CTLZ_ZERO_UNDEF, MVT::i64, Custom); > > setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1, Expand); > - setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand); > - setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand); > - setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Expand); > + if (!STI.hasMovsx()) { > + setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand); > + setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand); > + setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Expand); > + } > > // Extended load operations for i1 types must be promoted > for (MVT VT : MVT::integer_valuetypes()) { > diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td > b/llvm/lib/Target/BPF/BPFInstrInfo.td > index a1d532e60db2..29bec72aa92d 100644 > --- a/llvm/lib/Target/BPF/BPFInstrInfo.td > +++ b/llvm/lib/Target/BPF/BPFInstrInfo.td > @@ -376,11 +376,11 @@ let Predicates = [BPFHasMovsx] in { > def MOVSX_rr_8 : ALU_RR<BPF_ALU64, BPF_MOV, 8, > (outs GPR:$dst), (ins GPR:$src), > "$dst = (s8)$src", > - [(set GPR:$dst, (sra (shl GPR:$src, (i64 56)), (i64 > 56)))]>; > + [(set GPR:$dst, (sext_inreg GPR:$src, i8))]>; > def MOVSX_rr_16 : ALU_RR<BPF_ALU64, BPF_MOV, 16, > (outs GPR:$dst), (ins GPR:$src), > "$dst = (s16)$src", > - [(set GPR:$dst, (sra (shl GPR:$src, (i64 48)), (i64 > 48)))]>; > + [(set GPR:$dst, (sext_inreg GPR:$src, i16))]>; > def MOVSX_rr_32 : ALU_RR<BPF_ALU64, BPF_MOV, 32, > (outs GPR:$dst), (ins GPR32:$src), > "$dst = (s32)$src", > @@ -388,11 +388,11 @@ let Predicates = [BPFHasMovsx] in { > def MOVSX_rr_32_8 : ALU_RR<BPF_ALU, BPF_MOV, 8, > (outs GPR32:$dst), (ins GPR32:$src), > "$dst = (s8)$src", > - [(set GPR32:$dst, (sra (shl GPR32:$src, (i32 24)), > (i32 24)))]>; > + [(set GPR32:$dst, (sext_inreg GPR32:$src, i8))]>; > def MOVSX_rr_32_16 : ALU_RR<BPF_ALU, BPF_MOV, 16, > (outs GPR32:$dst), (ins GPR32:$src), > "$dst = (s16)$src", > - [(set GPR32:$dst, (sra (shl GPR32:$src, (i32 16)), > (i32 16)))]>; > + [(set GPR32:$dst, (sext_inreg GPR32:$src, i16))]>; > } > } > ``` This indeed can simplify the code. I will incorporate your change into the patch. Thanks! ================ Comment at: llvm/lib/Target/BPF/BPFMIPeephole.cpp:321 + + std::map<unsigned, unsigned> ReverseCondOpMap; ---------------- eddyz87 wrote: > Is this map unused? No. This is a leftover. Will remove. ================ Comment at: llvm/lib/Target/BPF/BPFMIPeephole.cpp:412 + int CurrNumInsns = 0; + std::map<MachineBasicBlock *, int> SoFarNumInsns; + std::map<MachineBasicBlock *, MachineBasicBlock *> FollowThroughBB; ---------------- eddyz87 wrote: > Nitpick: Fangrui suggested in my llvm-objdump revisions to use `DenseMap` in > most cases (as `std::map` allocates for each pair). Will try to use DenseMap. ================ Comment at: llvm/test/CodeGen/BPF/movsx.ll:30 +} +; CHECK: w0 = w1 # encoding: [0xbc,0x10,0x00,0x00,0x00,0x00,0x00,0x00] + ---------------- eddyz87 wrote: > This does not seem right, as it does not sign extend 8-bit argument to 16-bit > value. This is probably due to ABI. For example, ``` $ cat t1.c __attribute__((noinline)) short f1(char a) { return a * a; } int f2(int a) { return f1(a); } $ clang --target=bpf -O2 -mcpu=v4 -S t1.c f1: # @f1 # %bb.0: # %entry w0 = w1 w0 *= w0 exit .Lfunc_end0: .size f1, .Lfunc_end0-f1 # -- End function .globl f2 # -- Begin function f2 .p2align 3 .type f2,@function f2: # @f2 # %bb.0: # %entry w1 = (s8)w1 call f1 w0 = (s16)w0 exit ``` You can see in function f2(), the sign-extension has been done properly. and that is probably the reason in f1(), the compiler didn't generate proper sign extension code. I will modify the test to generate proper sign extension like the above f2(). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D144829/new/ https://reviews.llvm.org/D144829 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits