On Wed, Jan 13, 2016 at 7:39 AM, Jeff Law <l...@redhat.com> wrote:
> On 01/12/2016 08:11 AM, Richard Biener wrote:
>>
>> On Tue, Jan 12, 2016 at 6:10 AM, Jeff Law <l...@redhat.com> wrote:
>>>
>>> On 01/11/2016 03:32 AM, Richard Biener wrote:
>>>
>>>>
>>>> Yeah, reassoc is largely about canonicalization.
>>>>
>>>>> Plus doing it in TER is almost certainly more complex than getting it
>>>>> right
>>>>> in reassoc to begin with.
>>>>
>>>>
>>>>
>>>> I guess canonicalizing differently is ok but you'll still create
>>>> ((a & b) & 1) & c then if you only change the above place.
>>>
>>>
>>> What's best for that expression would depend on factors like whether or
>>> not
>>> the target can exploit ILP.  ie (a & b) & (1 & c) exposes more
>>> parallelism
>>> while (((a & b) & c) & 1) is not good for parallelism, but does expose
>>> the
>>> bit test.
>>>
>>> reassoc currently generates ((a & 1) & b) & c which is dreadful as
>>> there's
>>> no ILP or chance of creating a bit test.  My patch shuffles things
>>> around,
>>> but still doesn't expose the ILP or bit test in the 4 operand case.
>>> Based
>>> on the comments in reassoc, it didn't seem like the author thought
>>> anything
>>> beyond the 3-operand case was worth handling. So my patch just handles
>>> the
>>> 3-operand case.
>>>
>>>
>>>
>>>>
>>>> So I'm not sure what pattern the backend is looking for?
>>>
>>>
>>> It just wants the constant last in the sequence.  That exposes bit clear,
>>> set, flip, test, etc idioms.
>>
>>
>> But those don't feed another bit operation, right?  Thus we'd like to see
>> ((a & b) & c) & 1, not ((a & b) & 1) & c?  It sounds like the instructions
>> are designed to feed conditionals (aka CC consuming ops)?
>
> At the gimple level they could feed a conditional, or be part of a series of
> ops on an SSA_NAME that eventually gets stored to memory, etc.  At the RTL
> level they'll feed CC consumers and bit manipulations of pseudos or memory.
>
> For the 3-op case, we always want the constant last.  For the 4-op case it's
> less clear.  Though ((a & b) & c) & 1 is certainly better than ((a & b) & 1)
> & c.

Ok, so handling it in swap_ops_for_binary_stmt is merely a convenient place
to special-case bitwise ops.  The "real" fix to the sorting heuristic would be
to sort constants at the opposite end.

That might be too invasive right now but there is another "convenient" place:

              /* If the operand vector is now empty, all operands were
                 consumed by the __builtin_powi optimization.  */
...
              else
                {
                  machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
                  int ops_num = ops.length ();
                  int width = get_reassociation_width (ops_num, rhs_code, mode);
                  tree new_lhs = lhs;

                  if (dump_file && (dump_flags & TDF_DETAILS))
                    fprintf (dump_file,
                             "Width = %d was chosen for
reassociation\n", width);

at this point you can check rhs_code and move the (single) constant
entry in ops (if there is any constant entry) from .last () to the beginning.

That'll get the 4 operand case correct as well and properly models
"constant last" for the specified operation kind.

Richard.


> Jeff

Reply via email to