After some more digging and adjusting I found additional cases where
registers are optimized out, so I decided to continue this thread to keep
the discussion compact.

With some changes, a simplified implementation of my expansion is as follows:
  tmp_op0 = gen_reg_rtx (mode);
  emit_move_insn (tmp_op0, op0);
  tmp_op1 = gen_reg_rtx (mode);
  emit_move_insn (tmp_op1, op1);

  /* This is the important part.  */
  reg = gen_rtx_REG (wide_mode, XMM2_REG);
  emit_insn (gen_rtx_SET (reg, tmp_op1));
  emit_insn (gen_myinsn (op2, reg));
  emit_insn (gen_rtx_SET (tmp_op0, reg));
  /* End of the important part.  */
And my md is as follows:
  (define_insn "myinsn"
    [(unspec [(match_operand:SI 0 "register_operand" "r")
              (match_operand:V4SI 1 "vector_operand")]
             UNSPEC_MYINSN)
     (clobber (reg:V4SI XMM2_REG))]
    "TARGET_MYTARGET"
    "instr\t%0"
    [(set_attr "type" "other")])
This works like a charm at any optimization level, producing something
like this:

  movdqu %eax, %xmm2
  instr %edx
  movups %xmm2, %eax
Unfortunately, when I additionally build with -mavx2 or -mavx512f, the first
move (from reg to xmm2) is optimized out. I'm using those extra flags because
I also want to use YMM2 and ZMM2 in my instruction.

Does anyone have an idea why this might happen, and how it can be overcome?
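
For reference, one direction suggested by Nathan's earlier advice (quoted
below) would be to make the XMM2 data flow explicit in the pattern itself,
instead of listing the register only in a clobber. The following is an
untested sketch, not a confirmed fix. It assumes that the instruction both
reads and writes XMM2, it drops operand 1, and it names the hard register
directly as an input and an output of the unspec:

  ;; Sketch only (assumption): XMM2 is both read and written by "instr".
  ;; The hard register appears as an explicit input of the unspec and as
  ;; the destination of the set, so data flow analysis sees a use of the
  ;; value loaded before the instruction and a definition after it.
  (define_insn "myinsn"
    [(set (reg:V4SI XMM2_REG)
          (unspec:V4SI [(match_operand:SI 0 "register_operand" "r")
                        (reg:V4SI XMM2_REG)]
                       UNSPEC_MYINSN))]
    "TARGET_MYTARGET"
    "instr\t%0"
    [(set_attr "type" "other")])

With a pattern of this shape the expander would call gen_myinsn (op2) after
setting XMM2. The YMM2 and ZMM2 cases would presumably need the same pattern
in V8SI and V16SI; that part is an assumption as well.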
Thanks,
Sebastian
> -----Original Message-----
> Subject: Re: Question regarding preventing optimizing out of register in
> expansion
>
> On 06/21/2018 05:20 AM, Peryt, Sebastian wrote:
> > Hi,
> >
> > I'd appreciate if someone could advise me in builtin expansion I'm currently
> > writing.
> >
> > High level description for what I want to do:
> >
> > I have 2 operands in my builtin.
>
> IIUC you're defining an UNSPEC.
>
> > First I set register (reg1) with value from operand1 (op1); Second I
> > call my instruction (reg1 is called implicitly and updated);
>
> Here is your error -- NEVER have implicit register settings. The data flow
> analysers need accurate information.
>
>
> > Simplified implementation in i386.c I have:
> >
> > reg1 = gen_reg_rtx (mode);
> > emit_insn (gen_rtx_SET (reg1, op1));
> > emit_clobber (reg1);
>
> At this point reg1 is dead. That means the previous set of reg1 from
> op1 is unneeded and can be deleted.
>
> > emit_insn (gen_myinstruction ());
>
> This instruction has no inputs or outputs, and is not marked volatile(?)
> so can be deleted.
>
> > emit_insn (gen_rtx_SET (op2,reg1));
>
> And this is storing a value from a dead register.
>
> You need something like:
> rtx reg1 = force_reg (mode, op1);
> rtx reg2 = gen_reg_rtx (mode);
> emit_insn (gen_my_insn (reg2, reg1));
> emit_insn (gen_rtx_SET (op2, reg2));
>
> Your instruction should be an UNSPEC showing what the inputs and outputs
> are. That tells the optimizers what depends on what, but the compiler
> has no clue about what the transform is.
>
> nathan
> --
> Nathan Sidwell