Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-23 Thread Michael Meissner
On Wed, Nov 23, 2016 at 11:52:01AM +, Andrew Stubbs wrote:
> On 16/11/16 17:05, Michael Meissner wrote:
> >I'm starting to test this patch right now (it's on LE power8 stage3 right 
> >now,
> >and I need to build BE power8 and BE power7 versions when I get into the 
> >office
> >shortly, and build spec 2017 with it for PR 78101):
> 
> Did the testing go OK?

Yes, the testing worked fine, and I checked in the fix.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-23 Thread Andrew Stubbs

On 16/11/16 17:05, Michael Meissner wrote:

I'm starting to test this patch right now (it's on LE power8 stage3 right now,
and I need to build BE power8 and BE power7 versions when I get into the office
shortly, and build spec 2017 with it for PR 78101):


Did the testing go OK?

Andrew



Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-16 Thread Michael Meissner
On Wed, Nov 16, 2016 at 04:15:10PM +, Andrew Stubbs wrote:
> On 16/11/16 13:10, Michael Meissner wrote:
> >Yeah, SFmode and DFmode should not have the TARGET_{S,D}F_FPR checks.
> 
> So, I can safely resolve my initial problem by simply removing them?
> And that wouldn't break the other use of that predicate?
> 
> >But a secondary problem is the early clobber in the match_scratch.
> 
> So, the FPR_FUSION insn works because operands 1 and 2 cannot
> conflict, which means the early-clobber is not necessary, but the
> GPR_FUSION insn cannot work because there's no way to ensure that
> operands 1 and 2 don't conflict without also specifying that
> operands 0 and 2 don't conflict, which they commonly do.
> 
> We could fix it, for now, by adding new patterns that fit both cases
> (given that the register numbers are known at peephole time).
> 
> Or, we could disable the peephole in the case where this would occur
> (as my original patch does, albeit bluntly).

I'm starting to test this patch right now (it's on LE power8 stage3 right now,
and I need to build BE power8 and BE power7 versions when I get into the office
shortly, and build spec 2017 with it for PR 78101):

[gcc]
2016-11-16  Michael Meissner  

PR target/78101
* config/rs6000/predicates.md (fusion_addis_mem_combo_load): Add
the appropriate checks for SFmode/DFmode load/stores in GPR
registers.
(fusion_addis_mem_combo_store): Likewise.
* config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Rename
fusion_fpr_* to fusion_vsx_* and add in support for ISA 3.0 scalar
d-form instructions for traditional Altivec registers.
(emit_fusion_p9_load): Likewise.
(emit_fusion_p9_store): Likewise.
* config/rs6000/rs6000.md (p9 fusion store peephole2): Remove
early clobber from scratch register.  Do not match if the register
being stored is the scratch register.
(fusion_vsx___load): Rename fusion_fpr_*
to fusion_vsx_* and add in support for ISA 3.0 scalar d-form
instructions for traditional Altivec registers.
(fusion_fpr___load): Likewise.
(fusion_vsx___store): Likewise.
(fusion_fpr___store): Likewise.

[gcc/testsuite]
2016-11-16  Michael Meissner  

PR target/78101
* gcc.target/powerpc/fusion4.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md 
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 242456)
+++ gcc/config/rs6000/predicates.md (.../gcc/config/rs6000) (working copy)
@@ -1844,7 +1844,7 @@ (define_predicate "fusion_gpr_mem_load"
 ;; Match a GPR load (lbz, lhz, lwz, ld) that uses a combined address in the
 ;; memory field with both the addis and the memory offset.  Sign extension
 ;; is not handled here, since lha and lwa are not fused.
-;; With extended fusion, also match a FPR load (lfd, lfs) and float_extend
+;; With P9 fusion, also match a fpr/vector load and float_extend
 (define_predicate "fusion_addis_mem_combo_load"
   (match_code "mem,zero_extend,float_extend")
 {
@@ -1873,11 +1873,15 @@ (define_predicate "fusion_addis_mem_comb
   break;
 
 case SFmode:
-case DFmode:
   if (!TARGET_P9_FUSION)
return 0;
   break;
 
+case DFmode:
+  if ((!TARGET_POWERPC64 && !TARGET_DF_FPR) || !TARGET_P9_FUSION)
+   return 0;
+  break;
+
 default:
   return 0;
 }
@@ -1920,6 +1924,7 @@ (define_predicate "fusion_addis_mem_comb
 case QImode:
 case HImode:
 case SImode:
+case SFmode:
   break;
 
 case DImode:
@@ -1927,13 +1932,8 @@ (define_predicate "fusion_addis_mem_comb
return 0;
   break;
 
-case SFmode:
-  if (!TARGET_SF_FPR)
-   return 0;
-  break;
-
 case DFmode:
-  if (!TARGET_DF_FPR)
+  if (!TARGET_POWERPC64 && !TARGET_DF_FPR)
return 0;
   break;
 
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 242456)
+++ gcc/config/rs6000/rs6000.c  (.../gcc/config/rs6000) (working copy)
@@ -3441,28 +3441,28 @@ rs6000_init_hard_regno_mode_ok (bool glo
 
   static const struct fuse_insns addis_insns[] = {
{ SFmode, DImode, RELOAD_REG_FPR,
- CODE_FOR_fusion_fpr_di_sf_load,
- CODE_FOR_fusion_fpr_di_sf_store },
+ CODE_FOR_fusion_vsx_di_sf_load,
+ CODE_FOR_fusion_vsx_di_sf_store },
 
{ SFmode, SImode, RELOAD_REG_FPR,
- CODE_FOR_fusion_fpr_si_sf_load,
- CODE_FOR_fusion_fpr_si_sf_store },
+ 

Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-16 Thread Andrew Stubbs

On 16/11/16 13:10, Michael Meissner wrote:

Yeah, SFmode and DFmode should not have the TARGET_{S,D}F_FPR checks.


So, I can safely resolve my initial problem by simply removing them? And 
that wouldn't break the other use of that predicate?



But a secondary problem is the early clobber in the match_scratch.


So, the FPR_FUSION insn works because operands 1 and 2 cannot conflict, 
which means the early-clobber is not necessary, but the GPR_FUSION insn 
cannot work because there's no way to ensure that operands 1 and 2 don't 
conflict without also specifying that operands 0 and 2 don't conflict, 
which they commonly do.


We could fix it, for now, by adding new patterns that fit both cases 
(given that the register numbers are known at peephole time).


Or, we could disable the peephole in the case where this would occur (as 
my original patch does, albeit bluntly).


Or, something else?

Andrew


Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-16 Thread Michael Meissner
On Tue, Nov 15, 2016 at 09:11:56PM +, Andrew Stubbs wrote:
> On 15/11/16 21:06, Michael Meissner wrote:
> >Now, that I have a little time, I can look into this, to at least make
> >predicate and peepholes match.  There is some other stuff (support for the 
> >new
> >load/store that were added to the compiler after that we should also tackle).
> 
> I've been investigating this today, and I've found that the insn
> does not match because the "fusion_addis_mem_combo_store" predicate
> requires TARGET_SF_FPR is true, which in turn requires
> TARGET_HARD_FLOAT is true.
> 
> So basically the fusion stuff is disabled in soft-float mode
> regardless of where the value is stored.
> 
> Anyway, I'm at end-of-day now, so let me know if you come up with anything.

Yeah, SFmode and DFmode should not have the TARGET_{S,D}F_FPR checks.

But a secondary problem is the early clobber in the match_scratch.

Before the peephole2 we have:

(insn 7 4 8 2 (set (reg/f:DI 3 3 [157])
(plus:DI (reg:DI 3 3 [ p ])
(const_int 327680 [0x5]))) "fusion3.c":12 75 {*adddi3}
 (nil))

(insn 8 7 14 2 (set (mem:SF (plus:DI (reg/f:DI 3 3 [157])
(const_int -29420 [0x8d14])) [1 MEM[(float *)p_1(D) 
+ 298260B]+0 S4 A32])
(reg:SF 4 4 [ f ])) "fusion3.c":12 484 {*movsf_softfloat}
 (expr_list:REG_DEAD (reg:SF 4 4 [ f ])
(expr_list:REG_DEAD (reg/f:DI 3 3 [157])
(nil

This gets combined into:

(insn 17 4 14 2 (parallel [
(set (mem:SF (plus:DI (plus:DI (reg:DI 3 3 [ p ])
(const_int 327680 [0x5]))
(const_int -29420 [0x8d14])) [1 MEM[(float 
*)p_1(D) + 298260B]+0 S4 A32])
(unspec:SF [
(reg:SF 4 4 [ f ])
] UNSPEC_FUSION_P9))
(clobber (reg/f:DI 3 3 [157]))
]) "fusion3.c":12 -1
 (nil))

where (reg:DI 3) is the base register temporary.  Since it was used as part of
the original address, the early clobber check in constrain_operands will flag
it.  This is what is causing PR 78101.  Thus there are two issues here.

In the scope of the ISA 3.0/power9 work, there is also the issue that the new
d-form load and stores should be folded into this as well.  This is due to me
doing the original power9/ISA 3.0 fusion first, and then putting it on the
shelf, and then months later working on the new address forms, and never
getting back to fusion.

As it is currently written, there was an expectation that the fusion stuff
would eventually move from being a peephole2 to being part of the addressing
handled before register allocation.  Then the secondary reload hook would call
the appropriate fusion helper function, which is why we have int_ and gpr_
fusion insn functions, each based on a separate register class.  The int_
fusion form allows SFmode/DFmode, but as you point out the combo recognizer
doesn't allow them in GPRs.

I'm not as sure that we will have the will to move fused addresses early before
register allocation, and whether we need to keep around the separate int_ and
fpr_ forms.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-15 Thread Andrew Stubbs

On 15/11/16 21:06, Michael Meissner wrote:

Now, that I have a little time, I can look into this, to at least make
predicate and peepholes match.  There is some other stuff (support for the new
load/store that were added to the compiler after that we should also tackle).


I've been investigating this today, and I've found that the insn does 
not match because the "fusion_addis_mem_combo_store" predicate requires 
TARGET_SF_FPR is true, which in turn requires TARGET_HARD_FLOAT is true.


So basically the fusion stuff is disabled in soft-float mode regardless 
of where the value is stored.


Anyway, I'm at end-of-day now, so let me know if you come up with anything.

Thanks

Andrew


Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-15 Thread Michael Meissner
On Mon, Nov 14, 2016 at 04:57:58PM +, Andrew Stubbs wrote:
> The testcase powerpc/fusion3.c causes an ICE when compiled with
> -msoft-float.
> 
> The key line in the testcase looks fairly harmless:
> 
>void fusion_float_write (float *p, float f){ p[LARGE] = f; }

LARGE is large enough that it won't fit in the offset field of a single
instruction.

> The error message look like this:
> 
> .../gcc.target/powerpc/fusion3.c: In function 'fusion_float_write':
> .../gcc.target/powerpc/fusion3.c:12:1: error: unrecognizable insn:
> (insn 18 4 14 2 (parallel [
> (set (mem:SF (plus:SI (plus:SI (reg:SI 3 3 [ p ])
> (const_int 327680 [0x5]))
> (const_int -29420 [0x8d14])) [1
> MEM[(float *)p_1(D) + 298260B]+0 S4 A32])
> (unspec:SF [
> (reg:SF 4 4 [ f ])
> ] UNSPEC_FUSION_P9))
> (clobber (reg/f:SI 3 3 [157]))
> ]) 
> "/scratch/astubbs/fsf/src/gcc-mainline/gcc/testsuite/gcc.target/powerpc/fusion3.c":12
> -1
>  (nil))

When I wrote the basic fusion stuff, I was assuming nobody would do
-msoft-float -mcpu=power7.  By the time the code had been written, the
soft-float libraries were no longer being built on the 64-bit Linux systems,
due to the Linux distributions dropping support for them.

However, while we can make this particular failure go away by making
powerpc_p9vector_ok (and probably some of the other targets needing VSX
features) false if -msoft-float it is still a problem, since SFmode can go in
GPRs.

This is the same basic failure (PR 78101) that I saw in building the next
generation of the Spec benchmark suite, except that it is a DFmode instead of
SFmode.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78101

Both are trying to store a value from a GPR


> Basically, the problem is that the peephole optimization tries to
> create a Power9 Fusion instruction, but those do not support SF
> values in integer registers (AFAICT).
> 
> So, presumably, I need to adjust either the predicate or the
> condition of the peephole rules.

Now, that I have a little time, I can look into this, to at least make
predicate and peepholes match.  There is some other stuff (support for the new
load/store that were added to the compiler after that we should also tackle).

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-15 Thread Andrew Stubbs

On 15/11/16 12:29, Segher Boessenkool wrote:

The peepholes do not support it, or maybe the define_insns do not either.
The machine of course will not care.


Oh, OK, so probably the bug is not in the peephole at all, but in the 
define_insn, or lack thereof.


More investigation required.

Thanks

Andrew


Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-15 Thread Segher Boessenkool
Hi Andrew,

Thanks for the patch and looking into this.

On Mon, Nov 14, 2016 at 04:57:58PM +, Andrew Stubbs wrote:
> The testcase powerpc/fusion3.c causes an ICE when compiled with 
> -msoft-float.

> Basically, the problem is that the peephole optimization tries to create 
> a Power9 Fusion instruction, but those do not support SF values in 
> integer registers (AFAICT).

The peepholes do not support it, or maybe the define_insns do not either.
The machine of course will not care.

> So, presumably, I need to adjust either the predicate or the condition 
> of the peephole rules.

Yes.

> The predicate used is "toc_fusion_or_p9_reg_operand", and this might be 
> the root cause, but I don't know the architecture well enough to be 
> sure.

This fusion is quite simple really.  Offset addressing insns can only
access a 16-bit range, but instead of writing e.g.

lwz 0,0x12345678(3)

you can do

addis 4,3,0x1234
lwz 0,0x5678(4)

and the processor will execute it faster, essentially like if it was
written as the first example.  See 2.1.1 in Power ISA 3.0.

> The predicate code seems to suggest that "toc_fusion", whatever 
> that is, should be able to do this, but the insn produced by the 
> peephole uses only UNSPEC_FUSION_P9, which does not. Perhaps this 
> predicate is inappropriate for the P9 Fusion peephole, or perhaps it 
> needs to be taught about this corner case?

One of those yes.

> In any case, I don't want to change the predicate without being sure 
> what it does (here and elsewhere), so the attached patch solves the 
> problem by changing the condition.
> 
> Is this OK, or do I need to do something less blunt?

We can have floats in GPRs even without TARGET_SOFT_FLOAT, so this
does not even work?  And yes this is blunt :-)


Segher


[PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-14 Thread Andrew Stubbs
The testcase powerpc/fusion3.c causes an ICE when compiled with 
-msoft-float.


The key line in the testcase looks fairly harmless:

   void fusion_float_write (float *p, float f){ p[LARGE] = f; }

The error message look like this:

.../gcc.target/powerpc/fusion3.c: In function 'fusion_float_write':^M
.../gcc.target/powerpc/fusion3.c:12:1: error: unrecognizable insn:^M
(insn 18 4 14 2 (parallel [^M
(set (mem:SF (plus:SI (plus:SI (reg:SI 3 3 [ p ])^M
(const_int 327680 [0x5]))^M
(const_int -29420 [0x8d14])) [1 
MEM[(float *)p_1(D) + 298260B]+0 S4 A32])^M

(unspec:SF [^M
(reg:SF 4 4 [ f ])^M
] UNSPEC_FUSION_P9))^M
(clobber (reg/f:SI 3 3 [157]))^M
]) 
"/scratch/astubbs/fsf/src/gcc-mainline/gcc/testsuite/gcc.target/powerpc/fusion3.c":12 
-1^M

 (nil))^M

Basically, the problem is that the peephole optimization tries to create 
a Power9 Fusion instruction, but those do not support SF values in 
integer registers (AFAICT).


So, presumably, I need to adjust either the predicate or the condition 
of the peephole rules.


The predicate used is "toc_fusion_or_p9_reg_operand", and this might be 
the root cause, but I don't know the architecture well enough to be 
sure. The predicate code seems to suggest that "toc_fusion", whatever 
that is, should be able to do this, but the insn produced by the 
peephole uses only UNSPEC_FUSION_P9, which does not. Perhaps this 
predicate is inappropriate for the P9 Fusion peephole, or perhaps it 
needs to be taught about this corner case?


In any case, I don't want to change the predicate without being sure 
what it does (here and elsewhere), so the attached patch solves the 
problem by changing the condition.


Is this OK, or do I need to do something less blunt?

Thanks

Andrew
2016-11-11  Andrew Stubbs  

	gcc/
	* config/rs6000/rs6000.md: Disable P9-fusion peepholes when
	TARGET_SOFT_FLOAT is set.

	gcc/testsuite/
	* gcc.target/powerpc/fusion3.c: Skip on -msoft-float.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index de959c9..28d8174 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -13024,7 +13024,8 @@
(set (match_operand:SFDF 2 "toc_fusion_or_p9_reg_operand" "")
 	(match_operand:SFDF 3 "fusion_offsettable_mem_operand" ""))]
   "TARGET_P9_FUSION && peep2_reg_dead_p (2, operands[0])
-   && fusion_p9_p (operands[0], operands[1], operands[2], operands[3])"
+   && fusion_p9_p (operands[0], operands[1], operands[2], operands[3])
+   && !TARGET_SOFT_FLOAT"
   [(const_int 0)]
 {
   expand_fusion_p9_load (operands);
@@ -13037,7 +13038,8 @@
(set (match_operand:SFDF 2 "offsettable_mem_operand" "")
 	(match_operand:SFDF 3 "toc_fusion_or_p9_reg_operand" ""))]
   "TARGET_P9_FUSION && peep2_reg_dead_p (2, operands[0])
-   && fusion_p9_p (operands[0], operands[1], operands[2], operands[3])"
+   && fusion_p9_p (operands[0], operands[1], operands[2], operands[3])
+   && !TARGET_SOFT_FLOAT"
   [(const_int 0)]
 {
   expand_fusion_p9_store (operands);
@@ -13050,7 +13052,8 @@
(set (match_dup 0)
 	(ior:SDI (match_dup 0)
 		 (match_operand:SDI 2 "u_short_cint_operand" "")))]
-  "TARGET_P9_FUSION"
+  "TARGET_P9_FUSION
+   && !TARGET_SOFT_FLOAT"
   [(set (match_dup 0)
 	(unspec:SDI [(match_dup 1)
 		 (match_dup 2)] UNSPEC_FUSION_P9))])
@@ -13063,7 +13066,8 @@
 		 (match_operand:SDI 3 "u_short_cint_operand" "")))]
   "TARGET_P9_FUSION
&& !rtx_equal_p (operands[0], operands[2])
-   && peep2_reg_dead_p (2, operands[0])"
+   && peep2_reg_dead_p (2, operands[0])
+   && !TARGET_SOFT_FLOAT"
   [(set (match_dup 2)
 	(unspec:SDI [(match_dup 1)
 		 (match_dup 3)] UNSPEC_FUSION_P9))])
diff --git a/gcc/testsuite/gcc.target/powerpc/fusion3.c b/gcc/testsuite/gcc.target/powerpc/fusion3.c
index 8eca640..20992d0 100644
--- a/gcc/testsuite/gcc.target/powerpc/fusion3.c
+++ b/gcc/testsuite/gcc.target/powerpc/fusion3.c
@@ -2,6 +2,7 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power7" } } */
+/* { dg-skip-if "-mpower9-fusion and -msoft-float are incompatible" { powerpc*-*-* } { "-msoft-float" } {} } */
 /* { dg-options "-mcpu=power7 -mtune=power9 -O3" } */
 
 #define LARGE 0x12345