Hi, I need some help with the following optimization issue:
The avr backend supports insns for bit insertion, and the insn combiner
tries to use them:
unsigned char bset (unsigned char a, unsigned char n)
{
    return (a & ~0x40) | (n & 0x40);
}
Trying 7 -> 14:
Successfully matched this instruction:
(set (zero_extract:QI (reg/i:QI 24 r24)
        (const_int 1 [0x1])
        (const_int 6 [0x6]))
    (lshiftrt:QI (reg:QI 52)
        (const_int 6 [0x6])))
rejecting combination of insns 7 and 14
original costs 4 + 4 = 8
replacement cost 24
Hence the combination is rejected, even though it matches an existing
insn, because its cost is too high.
The problem is that the backend only sees
avr_rtx_costs[bset:combine(266)]=true (size) total=24, outer=set:
(lshiftrt:QI (reg:QI 52)
    (const_int 6 [0x6]))
So to the cost function this looks like a plain QI shift: the ZERO_EXTRACT
destination has been dropped and only the outer code SET is passed along,
which is not very helpful.
A shift by 6 is actually more expensive than a bit insertion, which on AVR
is just a BST / BLD pair.
How can I fix that?
What I'd like to avoid is writing a huge number of complicated patterns,
like one for:
Trying 8, 7 -> 9:
Failed to match this instruction:
(set (reg:QI 50)
    (ior:QI (and:QI (reg/v:QI 49 [ n ])
            (const_int 64 [0x40]))
        (and:QI (reg:QI 24 r24 [ a ])
            (const_int -65 [0xffffffffffffffbf]))))
This would be a different representation of bit insertion, but it would
also need many patterns:
* Ones for the same bit number (like in the example)
* Ones where the src bit is lower than the dest bit (needs ASHIFT).
* Ones where the src bit is higher than the dest bit (needs LSHIFTRT).
* Ones where the MSB has to be inserted (will use a different canonical form)
* Ones where the LSB has to be inserted (will use a different canonical form)
* ... you name it.
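To make the first three cases concrete, here is a rough sketch of C sources that would lead to them. The function names (insert_bit, bset_same, bset_up, bset_down) and the bit positions 2 and 6 are made up for illustration only; I have not checked which RTL the combiner actually produces for the shifted variants.

```c
/* Hypothetical reference helper, for illustration only: copy bit SRC
   of N into bit DST of A and leave all other bits of A unchanged.  */
static unsigned char
insert_bit (unsigned char a, unsigned char n, unsigned src, unsigned dst)
{
    unsigned char bit = (n >> src) & 1;
    return (unsigned char) ((a & ~(1u << dst)) | (bit << dst));
}

/* Same bit number on both sides, as in the example above.  */
unsigned char bset_same (unsigned char a, unsigned char n)
{
    return (a & ~0x40) | (n & 0x40);
}

/* Source bit 2 is lower than destination bit 6: left shift by 4.  */
unsigned char bset_up (unsigned char a, unsigned char n)
{
    return (a & ~0x40) | ((n << 4) & 0x40);
}

/* Source bit 6 is higher than destination bit 2: right shift by 4.  */
unsigned char bset_down (unsigned char a, unsigned char n)
{
    return (a & ~0x04) | ((n >> 4) & 0x04);
}
```

All three are the same operation as far as the hardware is concerned (insert_bit with different SRC and DST), which is why one pattern per shape seems like the wrong approach.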
Any ideas for a sane approach?
Thanks,
Johann