On Thu, Oct 12, 2017 at 08:32:32AM +0200, Uros Bizjak wrote: > On Wed, Oct 11, 2017 at 10:59 PM, Jakub Jelinek <ja...@redhat.com> wrote: > > As can be seen on the testcase below, the *<rotate_insn><mode>3_mask > > insn/splitter is able to optimize only the case when the and is > > performed in SImode and then the result subreged into QImode, > > while if the computation is already in QImode, we don't handle it. > > > > Fixed by adding another pattern, bootstrapped/regtested on x86_64-linux and > > i686-linux, ok for trunk? > > We probably want to add this variant to *all* *_mask splitters (there > are a few of them in i386.md, please grep for "Avoid useless
Well, not all of them, just the 5 ones that have (subreg:QI (and:SI ...)), *jcc_bt<mode>_mask expects SImode argument, so doesn't have this kind of problem. > masking"). Which finally begs a question - should we implement this > simplification in a generic, target-independent way? OTOH, we already The target independent way is SHIFT_COUNT_TRUNCATED, but that isn't applicable to targets which aren't consistent in their behavior across the whole ISA. x86 has some instructions that truncate the mask and others that saturate and others (b* with memory operand) that use something still different. So we need to apply it only to the instructions which really truncate the shift counts and the best infrastructure for that I'm aware is the combiner. In order to have just one set of the *_mask patterns we'd need to change simplify_truncation to canonicalize (subreg:QI (and:SI (something:SI) (const_int N)) low) into (and:QI (subreg:QI (something:SI) low) (const_int lowN)) and change all these patterns; but I'm afraid that is going to break almost all WORD_REGISTER_OPERATIONS targets and various others, those actually perform all the operations in word mode and that is why the first form is actually preferred for them. Except that if something is already QImode there is no need for the subreg at all, which is why we I'm afraid need the two sets of mask patterns. So, if you aren't against it, I can extend the patch to handle the 4 other mask patterns; as for other modes, SImode is what is being handled already, DImode is not a problem, because the FEs truncate the shift counts to integer_type_node already, and for HImode I haven't seen problem probably because most tunings avoid HImode math and so it isn't worth optimizing. > have SHIFT_COUNT_TRUNCATED and shift_truncation_mask hooks, but last > time I try the former, there were some problems in the testsuite on > x86. I guess there are several targets that would benefit from > removing useless masking of count operands. > > Uros. > > > 2017-10-11 Jakub Jelinek <ja...@redhat.com> > > > > PR target/82498 > > * config/i386/i386.md (*<rotate_insn><mode>3_mask_1): New > > define_insn_and_split. > > > > * gcc.target/i386/pr82498.c: New test. Jakub