On Thu, Oct 12, 2017 at 08:32:32AM +0200, Uros Bizjak wrote:
> On Wed, Oct 11, 2017 at 10:59 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> > As can be seen on the testcase below, the *<rotate_insn><mode>3_mask
> > insn/splitter is able to optimize only the case when the and is
> > performed in SImode and then the result subreged into QImode,
> > while if the computation is already in QImode, we don't handle it.
> >
> > Fixed by adding another pattern, bootstrapped/regtested on x86_64-linux and
> > i686-linux, ok for trunk?
> 
> We probably want to add this variant to *all* *_mask splitters (there
> are a few of them in i386.md, please grep for "Avoid useless

Well, not all of them, just the 5 that have (subreg:QI (and:SI ...));
*jcc_bt<mode>_mask expects a SImode argument, so it doesn't have this kind
of problem.
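
To illustrate the shape in question (just a sketch, not the actual
pr82498.c testcase):

/* Sketch only: the & 7 is redundant for the hw rol, which effectively
   rotates an 8-bit value mod 8, but because the and is performed in
   QImode the existing *_mask pattern (which expects an SImode and whose
   result is subreged to QImode) doesn't match it.  */
unsigned char
rotl8 (unsigned char x, unsigned char c)
{
  c &= 7;
  return (unsigned char) ((x << c) | (x >> ((8 - c) & 7)));
}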

> masking"). Which finally begs a question - should we implement this
> simplification in a generic, target-independent way? OTOH, we already

The target-independent way is SHIFT_COUNT_TRUNCATED, but that isn't
applicable to targets which aren't consistent in their behavior across
the whole ISA.  x86 has some instructions that truncate the count, others
that saturate, and others (b* with a memory operand) that do something
different still.
So we need to apply it only to the instructions which really truncate
the shift counts, and the best infrastructure for that I'm aware of is the
combiner.
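
For the instructions that really do truncate, matching the redundant and
per-pattern in the combiner is enough; roughly this shape (a sketch, not
code from the patch):

/* Sketch only: SImode shifts truncate the count mod 32 in hardware, so
   the & 31 can be matched away by the *_mask patterns; a blanket
   SHIFT_COUNT_TRUNCATED would instead assert truncation for every count
   user, including bt with a memory operand, where it doesn't hold.  */
unsigned int
shl_masked (unsigned int x, unsigned int c)
{
  return x << (c & 31);
}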

In order to have just one set of the *_mask patterns we'd need to
change simplify_truncation to canonicalize
(subreg:QI (and:SI (something:SI) (const_int N)) low)
into
(and:QI (subreg:QI (something:SI) low) (const_int lowN))
and change all these patterns; but I'm afraid that would break almost
all WORD_REGISTER_OPERATIONS targets and various others, which actually
perform all the operations in word mode, and that is why the first form
is preferred for them.  Except that if something is already QImode there
is no need for the subreg at all, which is why I'm afraid we need the
two sets of mask patterns.
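
The canonicalization itself would be value-preserving, since only the low
QImode bits of both the value and the constant survive the truncation; in
C terms roughly (a trivial sketch, not part of the patch):

/* Truncating after the and is the same as truncating first and masking
   with the low QImode bits of the constant (0x11f & 0xff == 0x1f), so
   both functions compute the low 5 bits of x.  */
unsigned char
trunc_after (unsigned int x) { return (unsigned char) (x & 0x11f); }

unsigned char
trunc_before (unsigned int x) { return (unsigned char) x & (0x11f & 0xff); }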

So, if you aren't against it, I can extend the patch to handle the 4
other mask patterns.  As for other modes, SImode is what is handled
already, DImode is not a problem because the FEs truncate the shift
counts to integer_type_node already, and for HImode I haven't seen a
problem, probably because most tunings avoid HImode math and so it
isn't worth optimizing.
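
The shift *_mask counterparts would then also cover the form where the
masking is done directly in QImode, roughly (again just a sketch, not an
actual testcase from the patch):

/* Sketch only: the count mask is computed in QImode; the hw shl
   truncates the count mod 32 anyway, so the and should be removable
   just like in the SImode-and-plus-subreg form the current patterns
   handle.  */
unsigned int
shl_masked_qi (unsigned int x, unsigned char c)
{
  c &= 31;
  return x << c;
}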

> have SHIFT_COUNT_TRUNCATED and shift_truncation_mask hooks, but last
> time I tried the former, there were some problems in the testsuite on
> x86. I guess there are several targets that would benefit from
> removing useless masking of count operands.
> 
> Uros.
> 
> > 2017-10-11  Jakub Jelinek  <ja...@redhat.com>
> >
> >         PR target/82498
> >         * config/i386/i386.md (*<rotate_insn><mode>3_mask_1): New
> >         define_insn_and_split.
> >
> >         * gcc.target/i386/pr82498.c: New test.

        Jakub
