Re: [Mesa-dev] [PATCH v3] R600/SI: Add pattern for AMDGPUurecip
On 20.04.2013 06:06, Tom Stellard wrote:
> On Thu, Apr 11, 2013 at 10:12:01AM +0200, Christian König wrote:
> > On 10.04.2013 18:50, Tom Stellard wrote:
> > > On Wed, Apr 10, 2013 at 05:59:48PM +0200, Michel Dänzer wrote:
> > > > [SNIP]
> > > We should start using the updated pattern syntax for all new patterns.
> > > This means replacing register classes with types for the input patterns
> > > and omitting the type in the output pattern:
> > >
> > > def : Pat <
> > >   (AMDGPUurecip i32:$src0),
> > >   (V_CVT_U32_F32_e32
> > >     (V_MUL_F32_e32 CONST.FP_UINT_MAX_PLUS_1,
> > >                    (V_RCP_IFLAG_F32_e32 (V_CVT_F32_U32_e32 $src0))))
> > > >;
> > >
> > > With that change:
> > >
> > > Reviewed-by: Tom Stellard
> >
> > BTW: I created the attached patches two weeks ago. They rework most
> > of the existing patterns on SI to use the new format, but I
> > currently don't have time to rebase, test & commit them. They
> > shouldn't change anything in functionality, so if you guys think
> > they are ok then please review and commit them.
>
> Thanks for doing this. I've thrown these patches into a branch along
> with changes to the R600 patterns. I will try to test them next week.
>
> Is there any reason why we can't squash all these patches together
> before we commit?

No, not really. I just usually split patches up so I can test each one
individually, so feel free to squash-merge them for the commit.

Christian.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3] R600/SI: Add pattern for AMDGPUurecip
On Thu, Apr 11, 2013 at 10:12:01AM +0200, Christian König wrote:
> On 10.04.2013 18:50, Tom Stellard wrote:
> > On Wed, Apr 10, 2013 at 05:59:48PM +0200, Michel Dänzer wrote:
> > > [SNIP]
> > We should start using the updated pattern syntax for all new patterns.
> > This means replacing register classes with types for the input patterns
> > and omitting the type in the output pattern:
> >
> > def : Pat <
> >   (AMDGPUurecip i32:$src0),
> >   (V_CVT_U32_F32_e32
> >     (V_MUL_F32_e32 CONST.FP_UINT_MAX_PLUS_1,
> >                    (V_RCP_IFLAG_F32_e32 (V_CVT_F32_U32_e32 $src0))))
> > >;
> >
> > With that change:
> >
> > Reviewed-by: Tom Stellard
>
> BTW: I created the attached patches two weeks ago. They rework most
> of the existing patterns on SI to use the new format, but I
> currently don't have time to rebase, test & commit them. They
> shouldn't change anything in functionality, so if you guys think
> they are ok then please review and commit them.
>

Thanks for doing this. I've thrown these patches into a branch along
with changes to the R600 patterns. I will try to test them next week.

Is there any reason why we can't squash all these patches together
before we commit?

-Tom

> Thanks,
> Christian.

> From f0175c616db5f6d3f1024137edbd8773c118f7dc Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Christian=20K=C3=B6nig?=
> Date: Thu, 28 Mar 2013 12:50:55 +0100
> Subject: [PATCH 1/9] R600/SI: remove nonsense select pattern
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Fortunately this pattern never matched, otherwise
> we would have generated incorrect code.
>
> Signed-off-by: Christian König
> ---
>  lib/Target/R600/SIInstructions.td |    9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
> index eb410d7..e37003e 100644
> --- a/lib/Target/R600/SIInstructions.td
> +++ b/lib/Target/R600/SIInstructions.td
> @@ -1019,18 +1019,11 @@ def S_MAX_U32 : SOP2_32 <0x0009, "S_MAX_U32", []>;
>  def S_CSELECT_B32 : SOP2 <
>    0x000a, (outs SReg_32:$dst),
>    (ins SReg_32:$src0, SReg_32:$src1, SCCReg:$scc), "S_CSELECT_B32",
> -  [(set (i32 SReg_32:$dst), (select (i1 SCCReg:$scc),
> -                                     SReg_32:$src0, SReg_32:$src1))]
> +  []
>  >;
>
>  def S_CSELECT_B64 : SOP2_64 <0x000b, "S_CSELECT_B64", []>;
>
> -// f32 pattern for S_CSELECT_B32
> -def : Pat <
> -  (f32 (select (i1 SCCReg:$scc), SReg_32:$src0, SReg_32:$src1)),
> -  (S_CSELECT_B32 SReg_32:$src0, SReg_32:$src1, SCCReg:$scc)
> ->;
> -
>  def S_AND_B32 : SOP2_32 <0x000e, "S_AND_B32", []>;
>
>  def S_AND_B64 : SOP2_64 <0x000f, "S_AND_B64",
> --
> 1.7.10.4
>
> From 7a2c0f084fa9ac949084a2c719d9944dd680a866 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Christian=20K=C3=B6nig?=
> Date: Thu, 28 Mar 2013 11:18:00 +0100
> Subject: [PATCH 2/9] R600/SI: start reworking patterns
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> We don't need register classes in patterns any longer.
> Let's start with the indirect addressing patterns.
>
> Signed-off-by: Christian König
> ---
>  lib/Target/R600/SIInstructions.td |   36 ++++++++++++++----------------------
>  1 file changed, 14 insertions(+), 22 deletions(-)
>
> diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
> index e37003e..6ee3923 100644
> --- a/lib/Target/R600/SIInstructions.td
> +++ b/lib/Target/R600/SIInstructions.td
> @@ -1542,45 +1542,37 @@ defm : SMRD_Pattern S_LOAD_DWORDX8_SGPR, v32i8>;
>  /** Indirect adressing **/
>  /** == **/
>
> -multiclass SI_INDIRECT_Pattern
> -SI_INDIRECT_DST IndDst> {
> +multiclass SI_INDIRECT_Pattern {
> +
>    // 1. Extract with offset
>    def : Pat<
> -    (vector_extract (vt rc:$vec),
> -      (i64 (zext (i32 (add VReg_32:$idx, imm:$off))))
> -    ),
> -    (f32 (SI_INDIRECT_SRC (IMPLICIT_DEF), rc:$vec, VReg_32:$idx, imm:$off))
> +    (vector_extract vt:$vec, (i64 (zext (add i32:$idx, imm:$off)))),
> +    (f32 (SI_INDIRECT_SRC (IMPLICIT_DEF), vt:$vec, VReg_32:$idx, imm:$off))
>    >;
>
>    // 2. Extract without offset
>    def : Pat<
> -    (vector_extract (vt rc:$vec),
> -      (i64 (zext (i32 VReg_32:$idx)))
> -    ),
> -    (f32 (SI_INDIRECT_SRC (IMPLICIT_DEF), rc:$vec, VReg_32:$idx, 0))
> +    (vector_extract vt:$vec, (i64 (zext i32:$idx))),
> +    (f32 (SI_INDIRECT_SRC (IMPLICIT_DEF), vt:$vec, i32:$idx, 0))
>    >;
>
>    // 3. Insert with offset
>    def : Pat<
> -    (vector_insert (vt rc:$vec), (f32 VReg_32:$val),
> -      (i64 (zext (i32 (add VReg_32:$idx, imm:$off))))
> -    ),
> -    (vt (IndDst (IMPLICIT_DEF), rc:$vec, VReg_32:$idx, imm:$off, VReg_32:$val))
> +    (vector_insert vt:$vec, f32:$val, (i64 (zext (add i32:$idx, imm:$off)))),
> +    (IndDst (IMPLICIT_DEF), vt:$vec, i32:$idx,
Re: [Mesa-dev] [PATCH v3] R600/SI: Add pattern for AMDGPUurecip
On 10.04.2013 18:50, Tom Stellard wrote:
> On Wed, Apr 10, 2013 at 05:59:48PM +0200, Michel Dänzer wrote:
> > [SNIP]
> We should start using the updated pattern syntax for all new patterns.
> This means replacing register classes with types for the input patterns
> and omitting the type in the output pattern:
>
> def : Pat <
>   (AMDGPUurecip i32:$src0),
>   (V_CVT_U32_F32_e32
>     (V_MUL_F32_e32 CONST.FP_UINT_MAX_PLUS_1,
>                    (V_RCP_IFLAG_F32_e32 (V_CVT_F32_U32_e32 $src0))))
> >;
>
> With that change:
>
> Reviewed-by: Tom Stellard

BTW: I created the attached patches two weeks ago. They rework most
of the existing patterns on SI to use the new format, but I
currently don't have time to rebase, test & commit them. They
shouldn't change anything in functionality, so if you guys think
they are ok then please review and commit them.

Thanks,
Christian.

From f0175c616db5f6d3f1024137edbd8773c118f7dc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?=
Date: Thu, 28 Mar 2013 12:50:55 +0100
Subject: [PATCH 1/9] R600/SI: remove nonsense select pattern
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fortunately this pattern never matched, otherwise
we would have generated incorrect code.

Signed-off-by: Christian König
---
 lib/Target/R600/SIInstructions.td |    9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
index eb410d7..e37003e 100644
--- a/lib/Target/R600/SIInstructions.td
+++ b/lib/Target/R600/SIInstructions.td
@@ -1019,18 +1019,11 @@ def S_MAX_U32 : SOP2_32 <0x0009, "S_MAX_U32", []>;
 def S_CSELECT_B32 : SOP2 <
   0x000a, (outs SReg_32:$dst),
   (ins SReg_32:$src0, SReg_32:$src1, SCCReg:$scc), "S_CSELECT_B32",
-  [(set (i32 SReg_32:$dst), (select (i1 SCCReg:$scc),
-                                     SReg_32:$src0, SReg_32:$src1))]
+  []
 >;

 def S_CSELECT_B64 : SOP2_64 <0x000b, "S_CSELECT_B64", []>;

-// f32 pattern for S_CSELECT_B32
-def : Pat <
-  (f32 (select (i1 SCCReg:$scc), SReg_32:$src0, SReg_32:$src1)),
-  (S_CSELECT_B32 SReg_32:$src0, SReg_32:$src1, SCCReg:$scc)
->;
-
 def S_AND_B32 : SOP2_32 <0x000e, "S_AND_B32", []>;

 def S_AND_B64 : SOP2_64 <0x000f, "S_AND_B64",
--
1.7.10.4

From 7a2c0f084fa9ac949084a2c719d9944dd680a866 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?=
Date: Thu, 28 Mar 2013 11:18:00 +0100
Subject: [PATCH 2/9] R600/SI: start reworking patterns
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We don't need register classes in patterns any longer.
Let's start with the indirect addressing patterns.

Signed-off-by: Christian König
---
 lib/Target/R600/SIInstructions.td |   36 ++++++++++++++----------------------
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
index e37003e..6ee3923 100644
--- a/lib/Target/R600/SIInstructions.td
+++ b/lib/Target/R600/SIInstructions.td
@@ -1542,45 +1542,37 @@ defm : SMRD_Pattern ;
 /** Indirect adressing **/
 /** == **/

-multiclass SI_INDIRECT_Pattern {
+multiclass SI_INDIRECT_Pattern {
+
   // 1. Extract with offset
   def : Pat<
-    (vector_extract (vt rc:$vec),
-      (i64 (zext (i32 (add VReg_32:$idx, imm:$off))))
-    ),
-    (f32 (SI_INDIRECT_SRC (IMPLICIT_DEF), rc:$vec, VReg_32:$idx, imm:$off))
+    (vector_extract vt:$vec, (i64 (zext (add i32:$idx, imm:$off)))),
+    (f32 (SI_INDIRECT_SRC (IMPLICIT_DEF), vt:$vec, VReg_32:$idx, imm:$off))
   >;

   // 2. Extract without offset
   def : Pat<
-    (vector_extract (vt rc:$vec),
-      (i64 (zext (i32 VReg_32:$idx)))
-    ),
-    (f32 (SI_INDIRECT_SRC (IMPLICIT_DEF), rc:$vec, VReg_32:$idx, 0))
+    (vector_extract vt:$vec, (i64 (zext i32:$idx))),
+    (f32 (SI_INDIRECT_SRC (IMPLICIT_DEF), vt:$vec, i32:$idx, 0))
   >;

   // 3. Insert with offset
   def : Pat<
-    (vector_insert (vt rc:$vec), (f32 VReg_32:$val),
-      (i64 (zext (i32 (add VReg_32:$idx, imm:$off))))
-    ),
-    (vt (IndDst (IMPLICIT_DEF), rc:$vec, VReg_32:$idx, imm:$off, VReg_32:$val))
+    (vector_insert vt:$vec, f32:$val, (i64 (zext (add i32:$idx, imm:$off)))),
+    (IndDst (IMPLICIT_DEF), vt:$vec, i32:$idx, imm:$off, f32:$val)
   >;

   // 4. Insert without offset
   def : Pat<
-    (vector_insert (vt rc:$vec), (f32 VReg_32:$val),
-      (i64 (zext (i32 VReg_32:$idx)))
-    ),
-    (vt (IndDst (IMPLICIT_DEF), rc:$vec, VReg_32:$idx, 0, VReg_32:$val))
+    (vector_insert vt:$vec, f32:$val, (i64 (zext i32:$idx))),
+    (IndDst (IMPLICIT_DEF), vt:$vec, i32:$idx, 0, f32:$val)
   >;
 }

-defm : SI_INDIRECT_Pattern ;
-defm : SI_INDIRECT_Pattern ;
-defm : SI_INDIRECT_Pattern ;
-defm : SI_INDIRECT_Pattern ;
+defm : SI_INDIRECT_Pattern ;
+defm : SI_INDIRECT_Pattern ;
+defm : SI_INDIRECT_Pattern ;
+defm : SI_INDIRECT_Pattern ;

 /** ===
Re: [Mesa-dev] [PATCH v3] R600/SI: Add pattern for AMDGPUurecip
On Wed, Apr 10, 2013 at 05:59:48PM +0200, Michel Dänzer wrote:
> From: Michel Dänzer
>
> 21 more little piglits with radeonsi.
>
> Signed-off-by: Michel Dänzer
> ---
>
> v3: Use constant for and add comments about scaling multiplications
>
>  lib/Target/R600/AMDGPUInstructions.td |  1 +
>  lib/Target/R600/R600Instructions.td   |  3 ++-
>  lib/Target/R600/SIInstructions.td     | 12 ++++++++++--
>  test/CodeGen/R600/urecip.ll           | 12 ++++++++++++
>  4 files changed, 25 insertions(+), 3 deletions(-)
>  create mode 100644 test/CodeGen/R600/urecip.ll
>
> diff --git a/lib/Target/R600/AMDGPUInstructions.td b/lib/Target/R600/AMDGPUInstructions.td
> index e740348..fa890c1 100644
> --- a/lib/Target/R600/AMDGPUInstructions.td
> +++ b/lib/Target/R600/AMDGPUInstructions.td
> @@ -94,6 +94,7 @@ class Constants {
>  int TWO_PI = 0x40c90fdb;
>  int PI = 0x40490fdb;
>  int TWO_PI_INV = 0x3e22f983;
> +int FP_UINT_MAX_PLUS_1 = 0x4f800000;    // 1 << 32 in floating point encoding
>  }
>  def CONST : Constants;
>
> diff --git a/lib/Target/R600/R600Instructions.td b/lib/Target/R600/R600Instructions.td
> index b4c45e1..8ede6cc 100644
> --- a/lib/Target/R600/R600Instructions.td
> +++ b/lib/Target/R600/R600Instructions.td
> @@ -1923,10 +1923,11 @@ def : COS_PAT ;
>  defm DIV_cm : DIV_Common;
>
>  // RECIP_UINT emulation for Cayman
> +// The multiplication scales from [0,1] to the unsigned integer range
>  def : Pat <
>    (AMDGPUurecip R600_Reg32:$src0),
>    (FLT_TO_UINT_eg (MUL_IEEE (RECIP_IEEE_cm (UINT_TO_FLT_eg R600_Reg32:$src0)),
> -                            (MOV_IMM_I32 0x4f800000)))
> +                            (MOV_IMM_I32 CONST.FP_UINT_MAX_PLUS_1)))
>  >;
>
>
> diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
> index e2a08fc..0226d5a 100644
> --- a/lib/Target/R600/SIInstructions.td
> +++ b/lib/Target/R600/SIInstructions.td
> @@ -602,8 +602,8 @@ defm V_READFIRSTLANE_B32 : VOP1_32 <0x0002, "V_READFIRSTLANE_B32", []>;
>  defm V_CVT_F32_I32 : VOP1_32 <0x0005, "V_CVT_F32_I32",
>    [(set VReg_32:$dst, (sint_to_fp VSrc_32:$src0))]
>  >;
> -//defm V_CVT_F32_U32 : VOP1_32 <0x0006, "V_CVT_F32_U32", []>;
> -//defm V_CVT_U32_F32 : VOP1_32 <0x0007, "V_CVT_U32_F32", []>;
> +defm V_CVT_F32_U32 : VOP1_32 <0x0006, "V_CVT_F32_U32", []>;
> +defm V_CVT_U32_F32 : VOP1_32 <0x0007, "V_CVT_U32_F32", []>;
>  defm V_CVT_I32_F32 : VOP1_32 <0x0008, "V_CVT_I32_F32",
>    [(set (i32 VReg_32:$dst), (fp_to_sint VSrc_32:$src0))]
>  >;
> @@ -1514,6 +1514,14 @@ def : Pat <
>    (BUFFER_LOAD_DWORD 0, 1, 0, 0, 0, 0, VReg_32:$voff, SReg_128:$sbase, 0, 0, 0)
>  >;
>
> +// The multiplication scales from [0,1] to the unsigned integer range
> +def : Pat <
> +  (AMDGPUurecip VSrc_32:$src0),
> +  (V_CVT_U32_F32_e32
> +    (V_MUL_F32_e32 CONST.FP_UINT_MAX_PLUS_1,
> +                   (V_RCP_IFLAG_F32_e32 (V_CVT_F32_U32_e32 VSrc_32:$src0))))
> +>;
> +

We should start using the updated pattern syntax for all new patterns.
This means replacing register classes with types for the input patterns
and omitting the type in the output pattern:

def : Pat <
  (AMDGPUurecip i32:$src0),
  (V_CVT_U32_F32_e32
    (V_MUL_F32_e32 CONST.FP_UINT_MAX_PLUS_1,
                   (V_RCP_IFLAG_F32_e32 (V_CVT_F32_U32_e32 $src0))))
>;

With that change:

Reviewed-by: Tom Stellard

>  /** == **/
>  /** VOP3 Patterns**/
>  /** == **/
> diff --git a/test/CodeGen/R600/urecip.ll b/test/CodeGen/R600/urecip.ll
> new file mode 100644
> index 0000000..dad02dd
> --- /dev/null
> +++ b/test/CodeGen/R600/urecip.ll
> @@ -0,0 +1,12 @@
> +;RUN: llc < %s -march=r600 -mcpu=verde | FileCheck %s
> +
> +;CHECK: V_RCP_IFLAG_F32_e32
> +
> +define void @test(i32 %p, i32 %q) {
> +  %i = udiv i32 %p, %q
> +  %r = bitcast i32 %i to float
> +  call void @llvm.SI.export(i32 15, i32 0, i32 1, i32 12, i32 0, float %r, float %r, float %r, float %r)
> +  ret void
> +}
> +
> +declare void @llvm.SI.export(i32, i32, i32, i32, i32, float, float, float, float)
> --
> 1.8.2
>
> ___
> llvm-commits mailing list
> llvm-comm...@cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
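A side note on the constant in the hunk above: the patch comment describes FP_UINT_MAX_PLUS_1 as "1 << 32 in floating point encoding", which corresponds to the 32-bit pattern 0x4f800000. The following is only a minimal Python sketch for checking that reading, not part of the patch; the helper names f32_from_bits and f32_to_bits are made up for illustration.

```python
import struct


def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer as an IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def f32_to_bits(value: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of a value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


if __name__ == "__main__":
    # Sign 0, biased exponent 0x9f = 159 (159 - 127 = 32), mantissa 0.
    print(f32_from_bits(0x4F800000))      # 4294967296.0, i.e. 2**32
    print(hex(f32_to_bits(2.0 ** 32)))    # 0x4f800000
    # The neighbouring PI constant from class Constants decodes the same way.
    print(f32_from_bits(0x40490FDB))
```

The same helpers can be used to sanity-check any other entry of the Constants class before it is referenced from a pattern.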
[Mesa-dev] [PATCH v3] R600/SI: Add pattern for AMDGPUurecip
From: Michel Dänzer

21 more little piglits with radeonsi.

Signed-off-by: Michel Dänzer
---

v3: Use constant for and add comments about scaling multiplications

 lib/Target/R600/AMDGPUInstructions.td |  1 +
 lib/Target/R600/R600Instructions.td   |  3 ++-
 lib/Target/R600/SIInstructions.td     | 12 ++++++++++--
 test/CodeGen/R600/urecip.ll           | 12 ++++++++++++
 4 files changed, 25 insertions(+), 3 deletions(-)
 create mode 100644 test/CodeGen/R600/urecip.ll

diff --git a/lib/Target/R600/AMDGPUInstructions.td b/lib/Target/R600/AMDGPUInstructions.td
index e740348..fa890c1 100644
--- a/lib/Target/R600/AMDGPUInstructions.td
+++ b/lib/Target/R600/AMDGPUInstructions.td
@@ -94,6 +94,7 @@ class Constants {
 int TWO_PI = 0x40c90fdb;
 int PI = 0x40490fdb;
 int TWO_PI_INV = 0x3e22f983;
+int FP_UINT_MAX_PLUS_1 = 0x4f800000;    // 1 << 32 in floating point encoding
 }
 def CONST : Constants;

diff --git a/lib/Target/R600/R600Instructions.td b/lib/Target/R600/R600Instructions.td
index b4c45e1..8ede6cc 100644
--- a/lib/Target/R600/R600Instructions.td
+++ b/lib/Target/R600/R600Instructions.td
@@ -1923,10 +1923,11 @@ def : COS_PAT ;
 defm DIV_cm : DIV_Common;

 // RECIP_UINT emulation for Cayman
+// The multiplication scales from [0,1] to the unsigned integer range
 def : Pat <
   (AMDGPUurecip R600_Reg32:$src0),
   (FLT_TO_UINT_eg (MUL_IEEE (RECIP_IEEE_cm (UINT_TO_FLT_eg R600_Reg32:$src0)),
-                            (MOV_IMM_I32 0x4f800000)))
+                            (MOV_IMM_I32 CONST.FP_UINT_MAX_PLUS_1)))
 >;


diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
index e2a08fc..0226d5a 100644
--- a/lib/Target/R600/SIInstructions.td
+++ b/lib/Target/R600/SIInstructions.td
@@ -602,8 +602,8 @@ defm V_READFIRSTLANE_B32 : VOP1_32 <0x0002, "V_READFIRSTLANE_B32", []>;
 defm V_CVT_F32_I32 : VOP1_32 <0x0005, "V_CVT_F32_I32",
   [(set VReg_32:$dst, (sint_to_fp VSrc_32:$src0))]
 >;
-//defm V_CVT_F32_U32 : VOP1_32 <0x0006, "V_CVT_F32_U32", []>;
-//defm V_CVT_U32_F32 : VOP1_32 <0x0007, "V_CVT_U32_F32", []>;
+defm V_CVT_F32_U32 : VOP1_32 <0x0006, "V_CVT_F32_U32", []>;
+defm V_CVT_U32_F32 : VOP1_32 <0x0007, "V_CVT_U32_F32", []>;
 defm V_CVT_I32_F32 : VOP1_32 <0x0008, "V_CVT_I32_F32",
   [(set (i32 VReg_32:$dst), (fp_to_sint VSrc_32:$src0))]
 >;
@@ -1514,6 +1514,14 @@ def : Pat <
   (BUFFER_LOAD_DWORD 0, 1, 0, 0, 0, 0, VReg_32:$voff, SReg_128:$sbase, 0, 0, 0)
 >;

+// The multiplication scales from [0,1] to the unsigned integer range
+def : Pat <
+  (AMDGPUurecip VSrc_32:$src0),
+  (V_CVT_U32_F32_e32
+    (V_MUL_F32_e32 CONST.FP_UINT_MAX_PLUS_1,
+                   (V_RCP_IFLAG_F32_e32 (V_CVT_F32_U32_e32 VSrc_32:$src0))))
+>;
+
 /** == **/
 /** VOP3 Patterns**/
 /** == **/
diff --git a/test/CodeGen/R600/urecip.ll b/test/CodeGen/R600/urecip.ll
new file mode 100644
index 0000000..dad02dd
--- /dev/null
+++ b/test/CodeGen/R600/urecip.ll
@@ -0,0 +1,12 @@
+;RUN: llc < %s -march=r600 -mcpu=verde | FileCheck %s
+
+;CHECK: V_RCP_IFLAG_F32_e32
+
+define void @test(i32 %p, i32 %q) {
+  %i = udiv i32 %p, %q
+  %r = bitcast i32 %i to float
+  call void @llvm.SI.export(i32 15, i32 0, i32 1, i32 12, i32 0, float %r, float %r, float %r, float %r)
+  ret void
+}
+
+declare void @llvm.SI.export(i32, i32, i32, i32, i32, float, float, float, float)
--
1.8.2
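To make the "scaling multiplication" in the SI pattern concrete, here is a rough numeric sketch in Python of the four-instruction sequence (convert to float, reciprocal, multiply by 2^32, convert back). It is not taken from the patch and rests on several assumptions: V_RCP_IFLAG_F32 is modelled as a correctly rounded single-precision reciprocal (the hardware result may differ in the last bits), the final conversion is assumed to saturate at 0xFFFFFFFF, and a zero input is simply rejected. The function names are hypothetical.

```python
import struct

TWO_POW_32 = 2.0 ** 32  # value the patch encodes as CONST.FP_UINT_MAX_PLUS_1


def to_f32(x: float) -> float:
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def urecip(x: int) -> int:
    """Approximate 2**32 / x the way the pattern's instruction chain would."""
    if not 0 < x <= 0xFFFFFFFF:
        raise ValueError("x must be a non-zero 32-bit unsigned integer")
    as_float = to_f32(float(x))          # V_CVT_F32_U32
    recip = to_f32(1.0 / as_float)       # V_RCP_IFLAG_F32 (assumed exact rounding)
    scaled = to_f32(TWO_POW_32 * recip)  # V_MUL_F32 with FP_UINT_MAX_PLUS_1
    return min(int(scaled), 0xFFFFFFFF)  # V_CVT_U32_F32 (saturation assumed)


if __name__ == "__main__":
    for x in (1, 2, 3, 4, 1000):
        print(x, urecip(x), (1 << 32) // x)
```

For powers of two the sketch matches the exact quotient; for other inputs it is off by a small amount because the reciprocal only carries 24 bits of precision, so a result like this would presumably need a correction step wherever it feeds an exact udiv.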