Re: Reverting r278411

2019-11-20 Thread Joseph Myers
On Wed, 20 Nov 2019, Jakub Jelinek wrote:

> On Wed, Nov 20, 2019 at 10:17:38AM -0600, Segher Boessenkool wrote:
> > On Wed, Nov 20, 2019 at 03:46:29PM +, Richard Sandiford wrote:
> > > Segher Boessenkool  writes:
> > > > UNLT & ORDERED is always LT.  When would it not be true?
> > > 
> > > LT traps on quiet NaNs for -ftrapping-math, UNLT and ORDERED don't.
> > 
> > No?  -ftrapping-math makes nothing trap.  The only thing it does is to
> > not do optimisations that are not valid if traps are considered to be
> > a user-visible thing.
> > 
> > Almost nothing ever traps on quiet NaNs.
> 
> A lot traps even with quiet NaNs, assuming exceptions are enabled.
> E.g. for x86 https://www.felixcloutier.com/x86/cmppd#tbl-3-1 lists the
> details which operations rise FE_INVALID and which don't if it is enabled.

The general IEEE rule is *operations with a non-floating-point result*.  
So comparisons (other than quiet ones) and conversions to integer types.  
(And the special case that it's unspecified whether fma (0, Inf, qNaN) 
traps or not.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Reverting r278411

2019-11-20 Thread Segher Boessenkool
On Wed, Nov 20, 2019 at 04:59:49PM +, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > On Wed, Nov 20, 2019 at 05:30:48PM +0100, Jakub Jelinek wrote:
> >> On Wed, Nov 20, 2019 at 10:17:38AM -0600, Segher Boessenkool wrote:
> >> > On Wed, Nov 20, 2019 at 03:46:29PM +, Richard Sandiford wrote:
> >> > > Segher Boessenkool  writes:
> >> > > > UNLT & ORDERED is always LT.  When would it not be true?
> >> > > 
> >> > > LT traps on quiet NaNs for -ftrapping-math, UNLT and ORDERED don't.
> >> > 
> >> > No?  -ftrapping-math makes nothing trap.  The only thing it does is to
> >> > not do optimisations that are not valid if traps are considered to be
> >> > a user-visible thing.
> >> > 
> >> > Almost nothing ever traps on quiet NaNs.
> >> 
> >> A lot traps even with quiet NaNs, assuming exceptions are enabled.
> >> E.g. for x86 https://www.felixcloutier.com/x86/cmppd#tbl-3-1 lists the
> >> details which operations rise FE_INVALID and which don't if it is enabled.
> >
> > That is what ordered comparisons (aka signaling comparisons) do, sure.
> > This is part of "almost nothing" in my count ;-)
> >
> > Ordered comparisons should trap both with and without -ftrapping-math.
> > The difference is that with -fno-trapping math GCC can ignore that and
> > just optimise code how it wants to.
> 
> Sure, no-one was disputing that.  I think you're arguing against
> a strawman here.

No, I don't understand the question I guess?  Everything in that table
that traps on quiet NaNs is a signaling comparison.

We should simply not do simplify_logical_relational_operation for
floating point types and signaling comparisons.  There is no "only for
some codes", etc.


Segher


Re: Reverting r278411

2019-11-20 Thread Segher Boessenkool
On Wed, Nov 20, 2019 at 04:35:24PM +, Richard Sandiford wrote:
> Actually, this doesn't work because *_operators want rtxes rather
> than codes.  I can get around that by passing op0 and op1 for
> the existing rtxes.  For the conversion at the end, I can do:
> 
>   machine_mode compared_mode = GET_MODE (XEXP (op0, 0));
> 
>   if (code == ORDERED && INTEGRAL_MODE_P (compared_mode))
> return const_true_rtx;

This should be all !HONOR_NANS?  Also LTGT should be turned into NE,
under that same condition.  So something like

  if (!HONOR_NANS (mode))
{
  /* UNORDERED cannot happen without NaNs.  */
  mask &= ~1;

  /* LTGT is written as NE, and ORDERED just is always true,
 without NaNs.  */
  if (mask == 12 || mask == 14)
mask |= 1;
}

before returning true for 15.

>   if (is_unsigned)
> code = unsigned_condition (code);
> 
> Or I can add signed_comparison_p and unsigned_comparison_p functions
> that take codes instead of rtxes.

That may be easier, dunno.

Thanks,


Segher


Re: Reverting r278411

2019-11-20 Thread Richard Sandiford
Segher Boessenkool  writes:
> On Wed, Nov 20, 2019 at 05:30:48PM +0100, Jakub Jelinek wrote:
>> On Wed, Nov 20, 2019 at 10:17:38AM -0600, Segher Boessenkool wrote:
>> > On Wed, Nov 20, 2019 at 03:46:29PM +, Richard Sandiford wrote:
>> > > Segher Boessenkool  writes:
>> > > > UNLT & ORDERED is always LT.  When would it not be true?
>> > > 
>> > > LT traps on quiet NaNs for -ftrapping-math, UNLT and ORDERED don't.
>> > 
>> > No?  -ftrapping-math makes nothing trap.  The only thing it does is to
>> > not do optimisations that are not valid if traps are considered to be
>> > a user-visible thing.
>> > 
>> > Almost nothing ever traps on quiet NaNs.
>> 
>> A lot traps even with quiet NaNs, assuming exceptions are enabled.
>> E.g. for x86 https://www.felixcloutier.com/x86/cmppd#tbl-3-1 lists the
>> details which operations rise FE_INVALID and which don't if it is enabled.
>
> That is what ordered comparisons (aka signaling comparisons) do, sure.
> This is part of "almost nothing" in my count ;-)
>
> Ordered comparisons should trap both with and without -ftrapping-math.
> The difference is that with -fno-trapping math GCC can ignore that and
> just optimise code how it wants to.

Sure, no-one was disputing that.  I think you're arguing against
a strawman here.

Richard


Re: Reverting r278411

2019-11-20 Thread Segher Boessenkool
On Wed, Nov 20, 2019 at 05:30:48PM +0100, Jakub Jelinek wrote:
> On Wed, Nov 20, 2019 at 10:17:38AM -0600, Segher Boessenkool wrote:
> > On Wed, Nov 20, 2019 at 03:46:29PM +, Richard Sandiford wrote:
> > > Segher Boessenkool  writes:
> > > > UNLT & ORDERED is always LT.  When would it not be true?
> > > 
> > > LT traps on quiet NaNs for -ftrapping-math, UNLT and ORDERED don't.
> > 
> > No?  -ftrapping-math makes nothing trap.  The only thing it does is to
> > not do optimisations that are not valid if traps are considered to be
> > a user-visible thing.
> > 
> > Almost nothing ever traps on quiet NaNs.
> 
> A lot traps even with quiet NaNs, assuming exceptions are enabled.
> E.g. for x86 https://www.felixcloutier.com/x86/cmppd#tbl-3-1 lists the
> details which operations rise FE_INVALID and which don't if it is enabled.

That is what ordered comparisons (aka signaling comparisons) do, sure.
This is part of "almost nothing" in my count ;-)

Ordered comparisons should trap both with and without -ftrapping-math.
The difference is that with -fno-trapping math GCC can ignore that and
just optimise code how it wants to.


Segher


Re: Reverting r278411

2019-11-20 Thread Richard Sandiford
Richard Sandiford  writes:
>>> I'd actually considered converting to signed and back instead of adding
>>> extra cases, but I thought that would be rejected as too inefficient.
>>> (That was a concern with my patch above.)  It seemed like one of the selling
>>> points of doing it your way was that everything was handled by one switch
>>> statement "in" and one switch statement "out", and I was trying to
>>> preserve that.
>>> 
>>> signed_condition and unsigned_condition assert on unordered comparisons,
>>> so if we're going to go that route, we either need to filter them out
>>> first or add maybe_* versions of the routines that return UNKNOWN.
>>
>> Yeah.  rs6000 has
>>
>> (define_predicate "unsigned_comparison_operator"
>>   (match_code "ltu,gtu,leu,geu"))
>> (define_predicate "signed_comparison_operator"
>>   (match_code "lt,gt,le,ge"))
>>
>> Maybe we should have those be for every target?
>>
>>   bool is_signed = (signed_comparison_operator (code0)
>> || signed_comparison_operator (code1));
>>   bool is_unsigned = (unsigned_comparison_operator (code0)
>>   || unsigned_comparison_operator (code1));
>>
>>   /* Don't allow mixing signed and unsigned comparisons.  */
>>   if (is_signed && is_unsigned)
>> return 0;
>>
>> (this takes care of your EQ/NE concern automatically, btw)
>>
>>   if (unsigned_comparison_operator (code0))
>> code0 = signed_condition (code0);
>>   if (unsigned_comparison_operator (code1))
>> code1 = signed_condition (code1);
>>
>> and at the end
>>
>>   if (is_unsigned && signed_comparison_operator (code))
>> code = unsigned_condition (code);
>
> OK, thanks, I'll do it this way.

Actually, this doesn't work because *_operators want rtxes rather
than codes.  I can get around that by passing op0 and op1 for
the existing rtxes.  For the conversion at the end, I can do:

  machine_mode compared_mode = GET_MODE (XEXP (op0, 0));

  if (code == ORDERED && INTEGRAL_MODE_P (compared_mode))
return const_true_rtx;

  if (is_unsigned)
code = unsigned_condition (code);

Or I can add signed_comparison_p and unsigned_comparison_p functions
that take codes instead of rtxes.

Do either of those sound OK, or would you prefer something else?

Richard


Re: Reverting r278411

2019-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2019 at 10:17:38AM -0600, Segher Boessenkool wrote:
> On Wed, Nov 20, 2019 at 03:46:29PM +, Richard Sandiford wrote:
> > Segher Boessenkool  writes:
> > > UNLT & ORDERED is always LT.  When would it not be true?
> > 
> > LT traps on quiet NaNs for -ftrapping-math, UNLT and ORDERED don't.
> 
> No?  -ftrapping-math makes nothing trap.  The only thing it does is to
> not do optimisations that are not valid if traps are considered to be
> a user-visible thing.
> 
> Almost nothing ever traps on quiet NaNs.

A lot traps even with quiet NaNs, assuming exceptions are enabled.
E.g. for x86 https://www.felixcloutier.com/x86/cmppd#tbl-3-1 lists the
details which operations rise FE_INVALID and which don't if it is enabled.

Jakub



Re: Reverting r278411

2019-11-20 Thread Segher Boessenkool
On Wed, Nov 20, 2019 at 03:46:29PM +, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > UNLT & ORDERED is always LT.  When would it not be true?
> 
> LT traps on quiet NaNs for -ftrapping-math, UNLT and ORDERED don't.

No?  -ftrapping-math makes nothing trap.  The only thing it does is to
not do optimisations that are not valid if traps are considered to be
a user-visible thing.

Almost nothing ever traps on quiet NaNs.

And indeed the existing code is wrong for this already, as I mentioned
before (in the context of IEEE SNaNs).  But yeah, we can do almost no
optimisation if trapping math is true, so we should skip everything in
that case.  I wonder how much of existing GCC code breaks this as well
though, hrm.


Segher


Re: Reverting r278411

2019-11-20 Thread Richard Sandiford
Segher Boessenkool  writes:
> Hi!
>
> On Wed, Nov 20, 2019 at 09:42:46AM +, Richard Sandiford wrote:
>> Segher Boessenkool  writes:
>> >> Before I resubmit, why is the simplify-rtx.c part all wrong?
>> >
>> > It was nice and simple, and it isn't anymore.  8 4 2 1 for the four of
>> > lt gt eq un are hardly worth documenting or making symbolic constants
>> > for, because a) they are familiar to all powerpc users anyway (the
>> > assembler has those as predefined constants!), admittedly this isn't a
>> > strong argument for most people;
>> 
>> Ah, OK.  I was totally unaware of the connection.
>
> Yeah, I should have documented it.  Time pressure was high the last N
> weeks, with everyone else putting stuff in before end of stage 1 things
> broke left and right for me.  Normally I would just not update trunk the
> last two months or so of stage 1 (for development -- for testing you
> always need current trunk of course), but I needed some stuff from it.
> Oh well, I finally dug myself out.
>
>> > but also b) they were only used in two short and obvious routines.
>> > After your patch the routines are no longer short or obvious.
>> >
>> > A comparison result is exactly one of four things: lt, gt, eq, or un.
>> > So we can express testing for any subset of those with just an OR of
>> > the masks for the individual conditions.  Whether a comparison is
>> > floating point, floating point no-nans, signed, or unsigned, is just
>> > a property of the comparison, not of the result, in this representation.
>> 
>> Yeah, this works for OR because we never lose the unorderdness.
>
> It works for everything, including things with a NOT.
>
> If the mode does not allow unordered, you mask that away all the way at
> the end, and nothing else needs to be done.  This doesn't have to be done
> for IOR because you can never end up with unordered in the mask if you
> didn't start out with some code that allows unordered.
>
> You also can encode NE as not allowing unordered, if you have a mode
> where that does not exist, but that is purely cosmetic.
>
> 8421  "full" "no-nan"
>   false  false
> 1000  lt lt
> 0100  gt gt
> 1100  ltgt   ne
> 0010  eq eq
> 1010  le le
> 0110  ge ge
> 1110  orderedtrue
> 0001  unordered
> 1001  unlt
> 0101  ungt
> 1101  ne
> 0011  uneq
> 1011  unle
> 0111  unge
>   true
>
> So for !HONOR_NANS modes we need to output LTGT as NE, and ORDERED as
> true, and that is all.
>
>> There were two aspects to my patch.  One was adding AND, and had:
>> 
>>   /* We only handle AND if we can ignore unordered cases.  */
>>   bool honor_nans_p = HONOR_NANS (GET_MODE (op0));
>>   if (code != IOR && (code != AND || honor_nans_p))
>> return 0;
>> 
>> This is needed because e.g. UNLT & ORDERED -> LT is only conditionally
>> valid.  It doesn't sound like you're objecting to that bit, is that right?
>> Or was this what you had in mind with the reference to no-nans?
>
> UNLT & ORDERED is always LT.  When would it not be true?

LT traps on quiet NaNs for -ftrapping-math, UNLT and ORDERED don't.

>> As mentioned in the covering note, I punted for this because handling
>> trapping FP comparisons correctly for AND would need its own set of
>> testcases.
>
> This isn't trapping arithmetic.  Unordered is a perfectly normal result.
> As IEEE 754 says:
>   Four mutually exclusive relations are possible: less than, equal,
>   greater than, and unordered.
> This same is true when !HONOR_NANS, for signed integer comparisons, and
> for unsigned integer comparisons: just UNORDERED never happens.
>
>> > If you do not mix signed and unsigned comparisons (they make not much
>> > sense mixed anyway(*)), you need no changes at all: just translate
>> > ltu/gtu/leu/geu to/from lt/gt/le/ge on the way in and out of this
>> > function (there already are helper functions for that, signed_condition
>> > and unsigned_condition).
>> 
>> So this all seems to come down to whether unsigned comparisons are handled
>> as separate mask bits or whether they're dealt with by removing the
>> unsignedness and then adding it back.  ISTM both are legitimate ways
>> of doing it.  I don't think one of them is "all wrong".
>
> It violates the whole design of the thing left and right.  I never
> documented that well (or at all), of course :-/
>
>> I was very belatedly getting around to dealing with Joseph's comment
>> when you sent your patch and had it approved.  Since that patch seemed
>> to be more favourably received in general, I was trying to work within
>> the existing style of your version.  And without the powerpc background,
>> I just assumed that the mask values were "documented" by the first block
>> of case statements:
>> 
>> case LT:
>>   return 8;
>> case GT:
>>   return 4;
>> case EQ:
>>   return 2;
>> case UNORDERED:
>>   return 1;
>
> Yeah, but it may not be obvious that exactly one of those is true for
> any comparison, and you can combine them t

Re: Reverting r278411

2019-11-20 Thread Segher Boessenkool
Hi!

On Wed, Nov 20, 2019 at 09:42:46AM +, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> >> Before I resubmit, why is the simplify-rtx.c part all wrong?
> >
> > It was nice and simple, and it isn't anymore.  8 4 2 1 for the four of
> > lt gt eq un are hardly worth documenting or making symbolic constants
> > for, because a) they are familiar to all powerpc users anyway (the
> > assembler has those as predefined constants!), admittedly this isn't a
> > strong argument for most people;
> 
> Ah, OK.  I was totally unaware of the connection.

Yeah, I should have documented it.  Time pressure was high the last N
weeks, with everyone else putting stuff in before end of stage 1 things
broke left and right for me.  Normally I would just not update trunk the
last two months or so of stage 1 (for development -- for testing you
always need current trunk of course), but I needed some stuff from it.
Oh well, I finally dug myself out.

> > but also b) they were only used in two short and obvious routines.
> > After your patch the routines are no longer short or obvious.
> >
> > A comparison result is exactly one of four things: lt, gt, eq, or un.
> > So we can express testing for any subset of those with just an OR of
> > the masks for the individual conditions.  Whether a comparison is
> > floating point, floating point no-nans, signed, or unsigned, is just
> > a property of the comparison, not of the result, in this representation.
> 
> Yeah, this works for OR because we never lose the unorderdness.

It works for everything, including things with a NOT.

If the mode does not allow unordered, you mask that away all the way at
the end, and nothing else needs to be done.  This doesn't have to be done
for IOR because you can never end up with unordered in the mask if you
didn't start out with some code that allows unordered.

You also can encode NE as not allowing unordered, if you have a mode
where that does not exist, but that is purely cosmetic.

8421  "full" "no-nan"
  false  false
1000  lt lt
0100  gt gt
1100  ltgt   ne
0010  eq eq
1010  le le
0110  ge ge
1110  orderedtrue
0001  unordered
1001  unlt
0101  ungt
1101  ne
0011  uneq
1011  unle
0111  unge
  true

So for !HONOR_NANS modes we need to output LTGT as NE, and ORDERED as
true, and that is all.

> There were two aspects to my patch.  One was adding AND, and had:
> 
>   /* We only handle AND if we can ignore unordered cases.  */
>   bool honor_nans_p = HONOR_NANS (GET_MODE (op0));
>   if (code != IOR && (code != AND || honor_nans_p))
> return 0;
> 
> This is needed because e.g. UNLT & ORDERED -> LT is only conditionally
> valid.  It doesn't sound like you're objecting to that bit, is that right?
> Or was this what you had in mind with the reference to no-nans?

UNLT & ORDERED is always LT.  When would it not be true?

> As mentioned in the covering note, I punted for this because handling
> trapping FP comparisons correctly for AND would need its own set of
> testcases.

This isn't trapping arithmetic.  Unordered is a perfectly normal result.
As IEEE 754 says:
  Four mutually exclusive relations are possible: less than, equal,
  greater than, and unordered.
This same is true when !HONOR_NANS, for signed integer comparisons, and
for unsigned integer comparisons: just UNORDERED never happens.

> > If you do not mix signed and unsigned comparisons (they make not much
> > sense mixed anyway(*)), you need no changes at all: just translate
> > ltu/gtu/leu/geu to/from lt/gt/le/ge on the way in and out of this
> > function (there already are helper functions for that, signed_condition
> > and unsigned_condition).
> 
> So this all seems to come down to whether unsigned comparisons are handled
> as separate mask bits or whether they're dealt with by removing the
> unsignedness and then adding it back.  ISTM both are legitimate ways
> of doing it.  I don't think one of them is "all wrong".

It violates the whole design of the thing left and right.  I never
documented that well (or at all), of course :-/

> I was very belatedly getting around to dealing with Joseph's comment
> when you sent your patch and had it approved.  Since that patch seemed
> to be more favourably received in general, I was trying to work within
> the existing style of your version.  And without the powerpc background,
> I just assumed that the mask values were "documented" by the first block
> of case statements:
> 
> case LT:
>   return 8;
> case GT:
>   return 4;
> case EQ:
>   return 2;
> case UNORDERED:
>   return 1;

Yeah, but it may not be obvious that exactly one of those is true for
any comparison, and you can combine them to a mask and do any logical
operations on it.

> Adding two more cases here didn't seem to make things any more unclear.
> But maybe it is more jarring if you've memorised the powerpc mapping.

It's also that 2**4 is small, but 2**6 is not.  The 2**4 cases all

Re: Reverting r278411

2019-11-20 Thread Richard Sandiford
Richard Sandiford  writes:
> Segher Boessenkool  writes:
>>> find_sets_in_insn has:
>>> 
>>>   /* Don't count call-insns, (set (reg 0) (call ...)), as a set.
>>>  The hard function value register is used only once, to copy to
>>>  someplace else, so it isn't worth cse'ing.  */
>>>   else if (GET_CODE (SET_SRC (x)) == CALL)
>>> ;
>>> 
>>> So n_sets == 1 can be true if we have a single SET in parallel
>>> with a call.  Unsurprising, I guess no-one had MEM sets in
>>> parallel with a call, so it didn't matter until now.
>>> 
>>> I'm retesting with a !CALL_P check added to make sure that was
>>> the only problem.
>>> 
>>> Before I resubmit, why is the simplify-rtx.c part all wrong?
>>
>> It was nice and simple, and it isn't anymore.  8 4 2 1 for the four of
>> lt gt eq un are hardly worth documenting or making symbolic constants
>> for, because a) they are familiar to all powerpc users anyway (the
>> assembler has those as predefined constants!), admittedly this isn't a
>> strong argument for most people;
>
> Ah, OK.  I was totally unaware of the connection.
>
>> but also b) they were only used in two short and obvious routines.
>> After your patch the routines are no longer short or obvious.
>>
>> A comparison result is exactly one of four things: lt, gt, eq, or un.
>> So we can express testing for any subset of those with just an OR of
>> the masks for the individual conditions.  Whether a comparison is
>> floating point, floating point no-nans, signed, or unsigned, is just
>> a property of the comparison, not of the result, in this representation.
>
> Yeah, this works for OR because we never lose the unorderdness.
>
> There were two aspects to my patch.  One was adding AND, and had:
>
>   /* We only handle AND if we can ignore unordered cases.  */
>   bool honor_nans_p = HONOR_NANS (GET_MODE (op0));
>   if (code != IOR && (code != AND || honor_nans_p))
> return 0;
>
> This is needed because e.g. UNLT & ORDERED -> LT is only conditionally
> valid.  It doesn't sound like you're objecting to that bit, is that right?
> Or was this what you had in mind with the reference to no-nans?
>
> As mentioned in the covering note, I punted for this because handling
> trapping FP comparisons correctly for AND would need its own set of
> testcases.  The current code punts for AND and for unsigned comparisons,
> so continuing to punt for ANDs on this case seemed fair game.
>
>> If you do not mix signed and unsigned comparisons (they make not much
>> sense mixed anyway(*)), you need no changes at all: just translate
>> ltu/gtu/leu/geu to/from lt/gt/le/ge on the way in and out of this
>> function (there already are helper functions for that, signed_condition
>> and unsigned_condition).
>
> So this all seems to come down to whether unsigned comparisons are handled
> as separate mask bits or whether they're dealt with by removing the
> unsignedness and then adding it back.  ISTM both are legitimate ways
> of doing it.  I don't think one of them is "all wrong".
>
> FWIW, I'd posted a patch to do the IOR/AND thing in July:
>
>   https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00933.html
>
> But it turns out I'd got the signalling NaN behaviour wrong:
>
>   https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01006.html
>
> (hence punting above :-)).
>
> I was very belatedly getting around to dealing with Joseph's comment
> when you sent your patch and had it approved.  Since that patch seemed
> to be more favourably received in general, I was trying to work within
> the existing style of your version.  And without the powerpc background,
> I just assumed that the mask values were "documented" by the first block
> of case statements:
>
> case LT:
>   return 8;
> case GT:
>   return 4;
> case EQ:
>   return 2;
> case UNORDERED:
>   return 1;
>
> Adding two more cases here didn't seem to make things any more unclear.
> But maybe it is more jarring if you've memorised the powerpc mapping.
>
> I'd actually considered converting to signed and back instead of adding
> extra cases, but I thought that would be rejected as too inefficient.
> (That was a concern with my patch above.)  It seemed like one of the selling
> points of doing it your way was that everything was handled by one switch
> statement "in" and one switch statement "out", and I was trying to
> preserve that.
>
> signed_condition and unsigned_condition assert on unordered comparisons,
> so if we're going to go that route, we either need to filter them out
> first or add maybe_* versions of the routines that return UNKNOWN.
> We also have to deal with the neutrality of EQ and NE.  E.g. something
> like:
>
>   rtx_code signed_code0 = maybe_signed_condition (code0);
>   rtx_code signed_code1 = maybe_signed_condition (code1);
>   if (signed_code0 != UNKNOWN && signed_code1 != UNKNOWN)
> code0 = signed_code0, code1 = signed_code1;
>   else if (signed_code0 != UNKNOWN || signed_code1 != UNKNOWN)
> return 0;
>
>   ...
>
>   if

Re: Reverting r278411

2019-11-20 Thread Richard Sandiford
Segher Boessenkool  writes:
>> find_sets_in_insn has:
>> 
>>   /* Don't count call-insns, (set (reg 0) (call ...)), as a set.
>>   The hard function value register is used only once, to copy to
>>   someplace else, so it isn't worth cse'ing.  */
>>   else if (GET_CODE (SET_SRC (x)) == CALL)
>>  ;
>> 
>> So n_sets == 1 can be true if we have a single SET in parallel
>> with a call.  Unsurprising, I guess no-one had MEM sets in
>> parallel with a call, so it didn't matter until now.
>> 
>> I'm retesting with a !CALL_P check added to make sure that was
>> the only problem.
>> 
>> Before I resubmit, why is the simplify-rtx.c part all wrong?
>
> It was nice and simple, and it isn't anymore.  8 4 2 1 for the four of
> lt gt eq un are hardly worth documenting or making symbolic constants
> for, because a) they are familiar to all powerpc users anyway (the
> assembler has those as predefined constants!), admittedly this isn't a
> strong argument for most people;

Ah, OK.  I was totally unaware of the connection.

> but also b) they were only used in two short and obvious routines.
> After your patch the routines are no longer short or obvious.
>
> A comparison result is exactly one of four things: lt, gt, eq, or un.
> So we can express testing for any subset of those with just an OR of
> the masks for the individual conditions.  Whether a comparison is
> floating point, floating point no-nans, signed, or unsigned, is just
> a property of the comparison, not of the result, in this representation.

Yeah, this works for OR because we never lose the unorderdness.

There were two aspects to my patch.  One was adding AND, and had:

  /* We only handle AND if we can ignore unordered cases.  */
  bool honor_nans_p = HONOR_NANS (GET_MODE (op0));
  if (code != IOR && (code != AND || honor_nans_p))
return 0;

This is needed because e.g. UNLT & ORDERED -> LT is only conditionally
valid.  It doesn't sound like you're objecting to that bit, is that right?
Or was this what you had in mind with the reference to no-nans?

As mentioned in the covering note, I punted for this because handling
trapping FP comparisons correctly for AND would need its own set of
testcases.  The current code punts for AND and for unsigned comparisons,
so continuing to punt for ANDs on this case seemed fair game.

> If you do not mix signed and unsigned comparisons (they make not much
> sense mixed anyway(*)), you need no changes at all: just translate
> ltu/gtu/leu/geu to/from lt/gt/le/ge on the way in and out of this
> function (there already are helper functions for that, signed_condition
> and unsigned_condition).

So this all seems to come down to whether unsigned comparisons are handled
as separate mask bits or whether they're dealt with by removing the
unsignedness and then adding it back.  ISTM both are legitimate ways
of doing it.  I don't think one of them is "all wrong".

FWIW, I'd posted a patch to do the IOR/AND thing in July:

  https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00933.html

But it turns out I'd got the signalling NaN behaviour wrong:

  https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01006.html

(hence punting above :-)).

I was very belatedly getting around to dealing with Joseph's comment
when you sent your patch and had it approved.  Since that patch seemed
to be more favourably received in general, I was trying to work within
the existing style of your version.  And without the powerpc background,
I just assumed that the mask values were "documented" by the first block
of case statements:

case LT:
  return 8;
case GT:
  return 4;
case EQ:
  return 2;
case UNORDERED:
  return 1;

Adding two more cases here didn't seem to make things any more unclear.
But maybe it is more jarring if you've memorised the powerpc mapping.

I'd actually considered converting to signed and back instead of adding
extra cases, but I thought that would be rejected as too inefficient.
(That was a concern with my patch above.)  It seemed like one of the selling
points of doing it your way was that everything was handled by one switch
statement "in" and one switch statement "out", and I was trying to
preserve that.

signed_condition and unsigned_condition assert on unordered comparisons,
so if we're going to go that route, we either need to filter them out
first or add maybe_* versions of the routines that return UNKNOWN.
We also have to deal with the neutrality of EQ and NE.  E.g. something
like:

  rtx_code signed_code0 = maybe_signed_condition (code0);
  rtx_code signed_code1 = maybe_signed_condition (code1);
  if (signed_code0 != UNKNOWN && signed_code1 != UNKNOWN)
code0 = signed_code0, code1 = signed_code1;
  else if (signed_code0 != UNKNOWN || signed_code1 != UNKNOWN)
return 0;

  ...

  if (signed_code0 == UNKNOWN)
code = unsigned_condition (code);

Or (without the maybe_* versions) something like:

  bool unsigned0_p
= (code0 == LTU || code0 == GTU || code0 == LEU || code0 ==

Re: Reverting r278411

2019-11-19 Thread Segher Boessenkool
Hi Richard,

On Tue, Nov 19, 2019 at 07:35:10PM +, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > r278411 causes bootstrap to fail on at least all powerpc*.  Also, I am
> > author of this simplify-rtx code, and I think what you did is all wrong.
> > Also, it should be two patches, the CSE part should be separate.  (I can
> > not tell if that is the part that regresses us, either).
> 
> I committed them as one patch because I'd tested them together,
> which I thought was the way this should be done.

Ideally you would test separate patches separately, too :-)

> In practice
> a clean revert would have to be done as a pair anyway: reverting
> one part individually would have required XFAILs on the tests.
> 
> And yeah, it was the cse.c part that was the problem.

Thanks for finding the problem so quickly, much appreciated.

> find_sets_in_insn has:
> 
>   /* Don't count call-insns, (set (reg 0) (call ...)), as a set.
>The hard function value register is used only once, to copy to
>someplace else, so it isn't worth cse'ing.  */
>   else if (GET_CODE (SET_SRC (x)) == CALL)
>   ;
> 
> So n_sets == 1 can be true if we have a single SET in parallel
> with a call.  Unsurprising, I guess no-one had MEM sets in
> parallel with a call, so it didn't matter until now.
> 
> I'm retesting with a !CALL_P check added to make sure that was
> the only problem.
> 
> Before I resubmit, why is the simplify-rtx.c part all wrong?

It was nice and simple, and it isn't anymore.  8 4 2 1 for the four of
lt gt eq un are hardly worth documenting or making symbolic constants
for, because a) they are familiar to all powerpc users anyway (the
assembler has those as predefined constants!), admittedly this isn't a
strong argument for most people; but also b) they were only used in two
short and obvious routines.  After your patch the routines are no
longer short or obvious.

A comparison result is exactly one of four things: lt, gt, eq, or un.
So we can express testing for any subset of those with just an OR of
the masks for the individual conditions.  Whether a comparison is
floating point, floating point no-nans, signed, or unsigned, is just
a property of the comparison, not of the result, in this representation.

If you do not mix signed and unsigned comparisons (they make not much
sense mixed anyway(*)), you need no changes at all: just translate
ltu/gtu/leu/geu to/from lt/gt/le/ge on the way in and out of this
function (there already are helper functions for that, signed_condition
and unsigned_condition).


Segher

(*) lt(a,b) & ltu(a,b) means the signs of a and b are equal, and a

Re: Reverting r278411

2019-11-19 Thread Richard Sandiford
Segher Boessenkool  writes:
> Hi,
>
> r278411 causes bootstrap to fail on at least all powerpc*.  Also, I am
> author of this simplify-rtx code, and I think what you did is all wrong.
> Also, it should be two patches, the CSE part should be separate.  (I can
> not tell if that is the part that regresses us, either).

I committed them as one patch because I'd tested them together,
which I thought was the way this should be done.  In practice
a clean revert would have to be done as a pair anyway: reverting
one part individually would have required XFAILs on the tests.

And yeah, it was the cse.c part that was the problem.
find_sets_in_insn has:

  /* Don't count call-insns, (set (reg 0) (call ...)), as a set.
 The hard function value register is used only once, to copy to
 someplace else, so it isn't worth cse'ing.  */
  else if (GET_CODE (SET_SRC (x)) == CALL)
;

So n_sets == 1 can be true if we have a single SET in parallel
with a call.  Unsurprising, I guess no-one had MEM sets in
parallel with a call, so it didn't matter until now.

I'm retesting with a !CALL_P check added to make sure that was
the only problem.

Before I resubmit, why is the simplify-rtx.c part all wrong?

Thanks,
Richard


Re: Reverting r278411

2019-11-19 Thread Segher Boessenkool
On Tue, Nov 19, 2019 at 04:32:09PM +, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > r278411 causes bootstrap to fail on at least all powerpc*.  Also, I am
> > author of this simplify-rtx code, and I think what you did is all wrong.
> > Also, it should be two patches, the CSE part should be separate.  (I can
> > not tell if that is the part that regresses us, either).
> >
> > Could you please revert the patch?  I'll re-implement the simplify-rtx
> > part of it properly, if you want.  And I can test the CSE part separately.
> > But right now we need trunk to work again.
> 
> OK, done as r278455.

Thanks!


Segher


Re: Reverting r278411

2019-11-19 Thread Richard Sandiford
Segher Boessenkool  writes:
> Hi,
>
> r278411 causes bootstrap to fail on at least all powerpc*.  Also, I am
> author of this simplify-rtx code, and I think what you did is all wrong.
> Also, it should be two patches, the CSE part should be separate.  (I can
> not tell if that is the part that regresses us, either).
>
> Could you please revert the patch?  I'll re-implement the simplify-rtx
> part of it properly, if you want.  And I can test the CSE part separately.
> But right now we need trunk to work again.

OK, done as r278455.

Richard