Re: [Mesa-dev] R600/SI: Support for local memory and derivatives
On Fri, 2013-06-28 at 14:37 -0700, Tom Stellard wrote:
> On Wed, Jun 19, 2013 at 06:28:21PM +0200, Michel Dänzer wrote:
> > These patches implement enough of local memory support to allow
> > radeonsi to use that for computing derivatives, as suggested by Tom.
> >
> > They also almost allow test/CodeGen/R600/local-memory.ll to generate
> > code for SI. Right now it still fails because it tries to copy a VGPR
> > to an SGPR, which is not possible.
>
> Can you add some lit tests for these new intrinsics

Done, updated patches attached.

> and also add CHECK lines for SI to the existing local-memory.ll test.

Can't do that while it still fails to generate SI code. Should I commit
the other patches anyway, which are only necessary for that test?

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer

From 3572bab6a6b5c967d19add0b0497a96123754ec2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michel=20D=C3=A4nzer?= michel.daen...@amd.com
Date: Thu, 21 Feb 2013 16:12:45 +0100
Subject: [PATCH v2 1/4] R600/SI: Add intrinsics for texture sampling with user derivatives
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---

v2: Add lit test

 lib/Target/R600/SIInstructions.td | 7 +-
 lib/Target/R600/SIIntrinsics.td | 1 +
 test/CodeGen/R600/llvm.SI.sampled.ll | 140 +++
 3 files changed, 147 insertions(+), 1 deletion(-)
 create mode 100644 test/CodeGen/R600/llvm.SI.sampled.ll

diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
index 9c96c08..c9eac7d 100644
--- a/lib/Target/R600/SIInstructions.td
+++ b/lib/Target/R600/SIInstructions.td
@@ -535,7 +535,7 @@ def IMAGE_SAMPLE_B : MIMG_Sampler_Helper 0x0025, IMAGE_SAMPLE_B;
 //def IMAGE_SAMPLE_LZ : MIMG_NoPattern_ IMAGE_SAMPLE_LZ, 0x0027;
 def IMAGE_SAMPLE_C : MIMG_Sampler_Helper 0x0028, IMAGE_SAMPLE_C;
 //def IMAGE_SAMPLE_C_CL : MIMG_NoPattern_ IMAGE_SAMPLE_C_CL, 0x0029;
-//def IMAGE_SAMPLE_C_D : MIMG_NoPattern_ IMAGE_SAMPLE_C_D, 0x002a;
+def IMAGE_SAMPLE_C_D : MIMG_Sampler_Helper 0x002a, IMAGE_SAMPLE_C_D;
 //def IMAGE_SAMPLE_C_D_CL : MIMG_NoPattern_ IMAGE_SAMPLE_C_D_CL, 0x002b;
 def IMAGE_SAMPLE_C_L : MIMG_Sampler_Helper 0x002c, IMAGE_SAMPLE_C_L;
 def IMAGE_SAMPLE_C_B : MIMG_Sampler_Helper 0x002d, IMAGE_SAMPLE_C_B;
@@ -1296,6 +1296,11 @@ multiclass SamplePatternsValueType addr_type {
   def : SampleArrayPattern int_SI_sampleb, IMAGE_SAMPLE_B, addr_type;
   def : SampleShadowPattern int_SI_sampleb, IMAGE_SAMPLE_C_B, addr_type;
   def : SampleShadowArrayPattern int_SI_sampleb, IMAGE_SAMPLE_C_B, addr_type;
+
+  def : SamplePattern int_SI_sampled, IMAGE_SAMPLE_D, addr_type;
+  def : SampleArrayPattern int_SI_sampled, IMAGE_SAMPLE_D, addr_type;
+  def : SampleShadowPattern int_SI_sampled, IMAGE_SAMPLE_C_D, addr_type;
+  def : SampleShadowArrayPattern int_SI_sampled, IMAGE_SAMPLE_C_D, addr_type;
 }
 defm : SamplePatternsv2i32;
diff --git a/lib/Target/R600/SIIntrinsics.td b/lib/Target/R600/SIIntrinsics.td
index 224cd2f..d2643e0 100644
--- a/lib/Target/R600/SIIntrinsics.td
+++ b/lib/Target/R600/SIIntrinsics.td
@@ -23,6 +23,7 @@ let TargetPrefix = SI, isTarget = 1 in {
   def int_SI_sample : Sample;
   def int_SI_sampleb : Sample;
+  def int_SI_sampled : Sample;
   def int_SI_samplel : Sample;
   def int_SI_imageload : Intrinsic [llvm_v4i32_ty], [llvm_anyvector_ty, llvm_v32i8_ty, llvm_i32_ty], [IntrNoMem];
diff --git a/test/CodeGen/R600/llvm.SI.sampled.ll b/test/CodeGen/R600/llvm.SI.sampled.ll
new file mode 100644
index 000..71b8ef5
--- /dev/null
+++ b/test/CodeGen/R600/llvm.SI.sampled.ll
@@ -0,0 +1,140 @@
+;RUN: llc %s -march=r600 -mcpu=verde | FileCheck %s
+
+;CHECK: IMAGE_SAMPLE_D {{VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+}}, 15
+;CHECK: IMAGE_SAMPLE_D {{VGPR[0-9]+_VGPR[0-9]+}}, 3
+;CHECK: IMAGE_SAMPLE_D {{VGPR[0-9]+}}, 2
+;CHECK: IMAGE_SAMPLE_D {{VGPR[0-9]+}}, 1
+;CHECK: IMAGE_SAMPLE_D {{VGPR[0-9]+}}, 4
+;CHECK: IMAGE_SAMPLE_D {{VGPR[0-9]+}}, 8
+;CHECK: IMAGE_SAMPLE_C_D {{VGPR[0-9]+_VGPR[0-9]+}}, 5
+;CHECK: IMAGE_SAMPLE_C_D {{VGPR[0-9]+_VGPR[0-9]+}}, 9
+;CHECK: IMAGE_SAMPLE_C_D {{VGPR[0-9]+_VGPR[0-9]+}}, 6
+;CHECK: IMAGE_SAMPLE_D {{VGPR[0-9]+_VGPR[0-9]+}}, 10
+;CHECK: IMAGE_SAMPLE_D {{VGPR[0-9]+_VGPR[0-9]+}}, 12
+;CHECK: IMAGE_SAMPLE_C_D {{VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+}}, 7
+;CHECK: IMAGE_SAMPLE_C_D {{VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+}}, 11
+;CHECK: IMAGE_SAMPLE_C_D {{VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+}}, 13
+;CHECK: IMAGE_SAMPLE_D {{VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+}}, 14
+;CHECK: IMAGE_SAMPLE_D {{VGPR[0-9]+}}, 8
+
+define void @test(i32 %a1, i32 %a2, i32 %a3, i32 %a4) {
+  %v1 = insertelement 4 x i32 undef, i32 %a1, i32 0
+  %v2 = insertelement 4 x i32 undef, i32 %a1, i32 1
+  %v3 = insertelement 4 x i32 undef, i32 %a1, i32 2
+  %v4 = insertelement 4 x i32 undef, i32 %a1, i32 3
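A note for readers following the CHECK lines in llvm.SI.sampled.ll: each expected IMAGE_SAMPLE_D / IMAGE_SAMPLE_C_D line pairs a result register tuple with a trailing number, and in every CHECK line quoted in this message the number of VGPRs in the tuple equals the number of bits set in that number (15 gives four registers, 3 gives two, 8 gives one). Reading the trailing number as a channel writemask is an assumption here; the patch itself does not say so. A minimal C++ sketch of that relationship, where `resultRegCount` is a made-up helper name and not part of the patch or of LLVM:

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): the number of result VGPRs
// a CHECK line expects, taken as the count of set bits in the low four
// bits of the trailing mask value.
inline unsigned resultRegCount(std::uint32_t mask) {
  unsigned count = 0;
  for (std::uint32_t bits = mask & 0xFu; bits != 0; bits >>= 1)
    count += bits & 1u;  // add one for every set bit
  return count;
}
```

With this reading, the mask 7 in the test corresponds to the three-register pattern `VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+`, and the mask 8 to a single `VGPR[0-9]+`.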
Re: [Mesa-dev] R600/SI: Support for local memory and derivatives
On Wed, Jul 10, 2013 at 12:32:25PM +0200, Michel Dänzer wrote:
> On Fri, 2013-06-28 at 14:37 -0700, Tom Stellard wrote:
> > On Wed, Jun 19, 2013 at 06:28:21PM +0200, Michel Dänzer wrote:
> > > These patches implement enough of local memory support to allow
> > > radeonsi to use that for computing derivatives, as suggested by Tom.
> > >
> > > They also almost allow test/CodeGen/R600/local-memory.ll to generate
> > > code for SI. Right now it still fails because it tries to copy a VGPR
> > > to an SGPR, which is not possible.
> >
> > Can you add some lit tests for these new intrinsics
>
> Done, updated patches attached.
>
> > and also add CHECK lines for SI to the existing local-memory.ll test.
>
> Can't do that while it still fails to generate SI code. Should I commit
> the other patches anyway, which are only necessary for that test?

Can you add a TODO comment to that test for adding SI checks?

With that change, the patches are:

Reviewed-by: Tom Stellard thomas.stell...@amd.com

> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Debian, X and DRI developer
Re: [Mesa-dev] R600/SI: Support for local memory and derivatives
On Wed, 2013-07-10 at 08:15 -0700, Tom Stellard wrote:
> On Wed, Jul 10, 2013 at 12:32:25PM +0200, Michel Dänzer wrote:
> > On Fri, 2013-06-28 at 14:37 -0700, Tom Stellard wrote:
> > > On Wed, Jun 19, 2013 at 06:28:21PM +0200, Michel Dänzer wrote:
> > > > These patches implement enough of local memory support to allow
> > > > radeonsi to use that for computing derivatives, as suggested by Tom.
> > > >
> > > > They also almost allow test/CodeGen/R600/local-memory.ll to generate
> > > > code for SI. Right now it still fails because it tries to copy a VGPR
> > > > to an SGPR, which is not possible.
> > >
> > > Can you add some lit tests for these new intrinsics
> >
> > Done, updated patches attached.
> >
> > > and also add CHECK lines for SI to the existing local-memory.ll test.
> >
> > Can't do that while it still fails to generate SI code. Should I commit
> > the other patches anyway, which are only necessary for that test?
>
> Can you add a TODO comment to that test for adding SI checks?
>
> With that change, the patches are:
>
> Reviewed-by: Tom Stellard thomas.stell...@amd.com

Thanks, I managed to enable basic lit testing after all, see the
attached patches 4 and 5.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer

From 0f11058228a2c6504ed78f9856e6de3f8af0c0e8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michel=20D=C3=A4nzer?= michel.daen...@amd.com
Date: Wed, 19 Jun 2013 11:01:00 +0200
Subject: [PATCH 4/5] R600/SI: Add pattern for the AMDGPU.barrier.local intrinsic
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

lit test coverage to follow in the next commit.

Reviewed-by: Tom Stellard thomas.stell...@amd.com
Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 lib/Target/R600/SIInstructions.td | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
index 61755b4..30f2a4a 100644
--- a/lib/Target/R600/SIInstructions.td
+++ b/lib/Target/R600/SIInstructions.td
@@ -774,8 +774,17 @@ def S_CBRANCH_EXECNZ : SOPP
 } // End isBranch = 1
 } // End isTerminator = 1
-//def S_BARRIER : SOPP_ 0x000a, S_BARRIER, [];
 let hasSideEffects = 1 in {
+def S_BARRIER : SOPP 0x000a, (ins), S_BARRIER,
+  [(int_AMDGPU_barrier_local)]
+ {
+  let SIMM16 = 0;
+  let isBarrier = 1;
+  let hasCtrlDep = 1;
+  let mayLoad = 1;
+  let mayStore = 1;
+}
+
 def S_WAITCNT : SOPP 0x000c, (ins i32imm:$simm16), S_WAITCNT $simm16,
   []
 ;
--
1.8.3.2

From 09715a4574c2e35b02176516f542bc0d1d0dc132 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michel=20D=C3=A4nzer?= michel.daen...@amd.com
Date: Mon, 17 Jun 2013 12:21:29 +0200
Subject: [PATCH v2 5/5] R600/SI: Initial local memory support
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Enough for the radeonsi driver to use it for calculating derivatives.

Reviewed-by: Tom Stellard thomas.stell...@amd.com
Signed-off-by: Michel Dänzer michel.daen...@amd.com
---

v2: Enable some lit testing of local memory on SI.

 lib/Target/R600/AMDGPUAsmPrinter.cpp | 7 +++
 lib/Target/R600/AMDGPUISelLowering.cpp | 4 +-
 lib/Target/R600/R600ISelLowering.cpp | 2 +
 lib/Target/R600/SIDefines.h | 4 ++
 lib/Target/R600/SIISelLowering.cpp | 5 ++
 lib/Target/R600/SIInstructions.td | 15 ++
 test/CodeGen/R600/local-memory-two-objects.ll | 51
 test/CodeGen/R600/local-memory.ll | 67 ++-
 8 files changed, 100 insertions(+), 55 deletions(-)
 create mode 100644 test/CodeGen/R600/local-memory-two-objects.ll

diff --git a/lib/Target/R600/AMDGPUAsmPrinter.cpp b/lib/Target/R600/AMDGPUAsmPrinter.cpp
index 996d2a6..e039b77 100644
--- a/lib/Target/R600/AMDGPUAsmPrinter.cpp
+++ b/lib/Target/R600/AMDGPUAsmPrinter.cpp
@@ -233,7 +233,14 @@ void AMDGPUAsmPrinter::EmitProgramInfoSI(MachineFunction MF) {
   OutStreamer.EmitIntValue(RsrcReg, 4);
   OutStreamer.EmitIntValue(S_00B028_VGPRS(MaxVGPR / 4) |
                            S_00B028_SGPRS(MaxSGPR / 8), 4);
+
+  if (MFI-ShaderType == ShaderType::COMPUTE) {
+    OutStreamer.EmitIntValue(R_00B84C_COMPUTE_PGM_RSRC2, 4);
+    OutStreamer.EmitIntValue(S_00B84C_LDS_SIZE(RoundUpToAlignment(MFI-LDSSize, 256) 8), 4);
+  }
   if (MFI-ShaderType == ShaderType::PIXEL) {
+    OutStreamer.EmitIntValue(R_00B02C_SPI_SHADER_PGM_RSRC2_PS, 4);
+    OutStreamer.EmitIntValue(S_00B02C_EXTRA_LDS_SIZE(RoundUpToAlignment(MFI-LDSSize, 256) 8), 4);
     OutStreamer.EmitIntValue(R_0286CC_SPI_PS_INPUT_ENA, 4);
     OutStreamer.EmitIntValue(MFI-PSInputAddr, 4);
   }
diff --git a/lib/Target/R600/AMDGPUISelLowering.cpp b/lib/Target/R600/AMDGPUISelLowering.cpp
index 4019a1f..7fad3bb 100644
--- a/lib/Target/R600/AMDGPUISelLowering.cpp
+++ b/lib/Target/R600/AMDGPUISelLowering.cpp
@@ -72,8 +72,6 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(TargetMachine TM) :
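A side note on the LDS size arithmetic in the AMDGPUAsmPrinter.cpp hunk of patch 5/5. The archived text reads `RoundUpToAlignment(MFI-LDSSize, 256) 8`, which looks like `RoundUpToAlignment(MFI->LDSSize, 256) >> 8` with the `>` characters lost by the archive; that reading is an assumption. Under it, the value written into the LDS_SIZE / EXTRA_LDS_SIZE field would be the local memory size in 256-byte units, rounded up. A standalone C++ sketch of just that arithmetic, where `roundUpTo` and `ldsSizeField` are illustrative names and not LLVM functions:

```cpp
#include <cstdint>

// Round value up to the next multiple of align (align must be non-zero).
// Stand-in for the rounding helper named in the patch.
inline std::uint64_t roundUpTo(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) / align * align;
}

// LDS size expressed in 256-byte granules, assuming the patch shifts the
// rounded byte count right by 8.
inline std::uint64_t ldsSizeField(std::uint64_t ldsBytes) {
  return roundUpTo(ldsBytes, 256) >> 8;
}
```

For example, a shader using 300 bytes of local memory would get a field value of 2 under this reading, since 300 rounds up to 512 bytes.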
Re: [Mesa-dev] R600/SI: Support for local memory and derivatives
On Wed, Jun 19, 2013 at 06:28:21PM +0200, Michel Dänzer wrote:
> These patches implement enough of local memory support to allow
> radeonsi to use that for computing derivatives, as suggested by Tom.
>
> They also almost allow test/CodeGen/R600/local-memory.ll to generate
> code for SI. Right now it still fails because it tries to copy a VGPR
> to an SGPR, which is not possible.

Can you add some lit tests for these new intrinsics and also add CHECK
lines for SI to the existing local-memory.ll test.

With the tests added, these patches are:

Reviewed-by: Tom Stellard thomas.stell...@amd.com

> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Debian, X and DRI developer
[Mesa-dev] R600/SI: Support for local memory and derivatives
These patches implement enough of local memory support to allow
radeonsi to use that for computing derivatives, as suggested by Tom.

They also almost allow test/CodeGen/R600/local-memory.ll to generate
code for SI. Right now it still fails because it tries to copy a VGPR
to an SGPR, which is not possible.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer

From f4ca359c4536aa53122b654196f2e007d50976f8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michel=20D=C3=A4nzer?= michel.daen...@amd.com
Date: Thu, 21 Feb 2013 16:12:45 +0100
Subject: [PATCH 1/6] R600/SI: Add intrinsics for texture sampling with user derivatives
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 lib/Target/R600/SIInstructions.td | 7 ++-
 lib/Target/R600/SIIntrinsics.td | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
index 9c96c08..c9eac7d 100644
--- a/lib/Target/R600/SIInstructions.td
+++ b/lib/Target/R600/SIInstructions.td
@@ -535,7 +535,7 @@ def IMAGE_SAMPLE_B : MIMG_Sampler_Helper 0x0025, IMAGE_SAMPLE_B;
 //def IMAGE_SAMPLE_LZ : MIMG_NoPattern_ IMAGE_SAMPLE_LZ, 0x0027;
 def IMAGE_SAMPLE_C : MIMG_Sampler_Helper 0x0028, IMAGE_SAMPLE_C;
 //def IMAGE_SAMPLE_C_CL : MIMG_NoPattern_ IMAGE_SAMPLE_C_CL, 0x0029;
-//def IMAGE_SAMPLE_C_D : MIMG_NoPattern_ IMAGE_SAMPLE_C_D, 0x002a;
+def IMAGE_SAMPLE_C_D : MIMG_Sampler_Helper 0x002a, IMAGE_SAMPLE_C_D;
 //def IMAGE_SAMPLE_C_D_CL : MIMG_NoPattern_ IMAGE_SAMPLE_C_D_CL, 0x002b;
 def IMAGE_SAMPLE_C_L : MIMG_Sampler_Helper 0x002c, IMAGE_SAMPLE_C_L;
 def IMAGE_SAMPLE_C_B : MIMG_Sampler_Helper 0x002d, IMAGE_SAMPLE_C_B;
@@ -1296,6 +1296,11 @@ multiclass SamplePatternsValueType addr_type {
   def : SampleArrayPattern int_SI_sampleb, IMAGE_SAMPLE_B, addr_type;
   def : SampleShadowPattern int_SI_sampleb, IMAGE_SAMPLE_C_B, addr_type;
   def : SampleShadowArrayPattern int_SI_sampleb, IMAGE_SAMPLE_C_B, addr_type;
+
+  def : SamplePattern int_SI_sampled, IMAGE_SAMPLE_D, addr_type;
+  def : SampleArrayPattern int_SI_sampled, IMAGE_SAMPLE_D, addr_type;
+  def : SampleShadowPattern int_SI_sampled, IMAGE_SAMPLE_C_D, addr_type;
+  def : SampleShadowArrayPattern int_SI_sampled, IMAGE_SAMPLE_C_D, addr_type;
 }
 defm : SamplePatternsv2i32;
diff --git a/lib/Target/R600/SIIntrinsics.td b/lib/Target/R600/SIIntrinsics.td
index 224cd2f..d2643e0 100644
--- a/lib/Target/R600/SIIntrinsics.td
+++ b/lib/Target/R600/SIIntrinsics.td
@@ -23,6 +23,7 @@ let TargetPrefix = SI, isTarget = 1 in {
   def int_SI_sample : Sample;
   def int_SI_sampleb : Sample;
+  def int_SI_sampled : Sample;
   def int_SI_samplel : Sample;
   def int_SI_imageload : Intrinsic [llvm_v4i32_ty], [llvm_anyvector_ty, llvm_v32i8_ty, llvm_i32_ty], [IntrNoMem];
--
1.8.3.1

From 7a0048bb2ab1b661831da2b764bf1a52f66bec15 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michel=20D=C3=A4nzer?= michel.daen...@amd.com
Date: Thu, 21 Feb 2013 18:51:38 +0100
Subject: [PATCH v3 2/6] R600/SI: Initial support for LDS/GDS instructions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---

v3: Drop vdst operand from DS_Store_Helper class, and adapt
    SIInsertWaits::getHwCounts() to handle that. Unfortunately, this seems
    to mess up the asm string output somehow, not sure what's going on
    there.

 lib/Target/R600/SIInsertWaits.cpp | 2 ++
 lib/Target/R600/SIInstrFormats.td | 24
 lib/Target/R600/SIInstrInfo.td | 23 +++
 lib/Target/R600/SIInstructions.td | 3 +++
 lib/Target/R600/SILowerControlFlow.cpp | 16
 5 files changed, 68 insertions(+)

diff --git a/lib/Target/R600/SIInsertWaits.cpp b/lib/Target/R600/SIInsertWaits.cpp
index c36e1dc..d31da45 100644
--- a/lib/Target/R600/SIInsertWaits.cpp
+++ b/lib/Target/R600/SIInsertWaits.cpp
@@ -134,6 +134,8 @@ Counters SIInsertWaits::getHwCounts(MachineInstr MI) {
   if (TSFlags SIInstrFlags::LGKM_CNT) {
     MachineOperand Op = MI.getOperand(0);
+    if (!Op.isReg())
+      Op = MI.getOperand(1);
     assert(Op.isReg() First LGKM operand must be a register!);
     unsigned Reg = Op.getReg();
diff --git a/lib/Target/R600/SIInstrFormats.td b/lib/Target/R600/SIInstrFormats.td
index 51f323d..434aa7e 100644
--- a/lib/Target/R600/SIInstrFormats.td
+++ b/lib/Target/R600/SIInstrFormats.td
@@ -281,6 +281,30 @@ class VINTRP bits 2 op, dag outs, dag ins, string asm, listdag pattern :
 let Uses = [EXEC] in {
+class DS bits8 op, dag outs, dag ins, string asm, listdag pattern :
+Enc64 outs, ins, asm, pattern {
+
+  bits8 vdst;
+  bits1 gds;
+  bits8 addr;
+  bits8 data0;
+  bits8 data1;
+  bits8 offset0;
+  bits8 offset1;
+
+  let
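A note on the SIInsertWaits::getHwCounts() change in patch 2/6: as quoted, it takes operand 0 and, if that is not a register, falls back to operand 1 before asserting that it has a register. This is presumably for DS stores, which no longer carry a vdst operand after the v3 change described in the commit message. A rough standalone C++ sketch of that selection logic follows. `Operand` and `firstRegisterOperand` are stand-ins invented for illustration; they are not the LLVM MachineOperand / MachineInstr API.

```cpp
#include <cassert>
#include <vector>

// Stand-in for an instruction operand; not the LLVM MachineOperand class.
struct Operand {
  bool isReg;    // true if the operand is a register
  unsigned reg;  // register number, only meaningful when isReg is true
};

// Pick the register the wait-count logic should inspect: operand 0 if it
// is a register, otherwise operand 1 (mirrors the two added patch lines).
inline unsigned firstRegisterOperand(const std::vector<Operand> &ops) {
  const Operand *op = &ops.at(0);
  if (!op->isReg)
    op = &ops.at(1);
  assert(op->isReg && "First LGKM operand must be a register!");
  return op->reg;
}
```

The sketch uses a pointer for the fallback so that choosing operand 1 does not overwrite operand 0; whether the original code held the operand by value or by reference is not recoverable from the archived text.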