[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-05-01 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad updated 
https://github.com/llvm/llvm-project/pull/90582

>From 17b75a9517891d662e677a357713c920bb79c43c Mon Sep 17 00:00:00 2001
From: David Stuttard 
Date: Tue, 30 Apr 2024 10:41:51 +0100
Subject: [PATCH 1/2] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load
 (#90201)

image_msaa_load is actually encoded as a VSAMPLE instruction and
requires the appropriate waitcnt variant.
---
 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp   |  8 --
 .../AMDGPU/llvm.amdgcn.image.msaa.load.ll | 26 +--
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 6ecb1c8bf6e1db..97c55e4d9e41c2 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -187,8 +187,12 @@ VmemType getVmemType(const MachineInstr &Inst) {
   const AMDGPU::MIMGInfo *Info = AMDGPU::getMIMGInfo(Inst.getOpcode());
   const AMDGPU::MIMGBaseOpcodeInfo *BaseInfo =
       AMDGPU::getMIMGBaseOpcodeInfo(Info->BaseOpcode);
-  return BaseInfo->BVH ? VMEM_BVH
-                       : BaseInfo->Sampler ? VMEM_SAMPLER : VMEM_NOSAMPLER;
+  // The test for MSAA here is because gfx12+ image_msaa_load is actually
+  // encoded as VSAMPLE and requires the appropriate s_waitcnt variant for that.
+  // Pre-gfx12 doesn't care since all vmem types result in the same s_waitcnt.
+  return BaseInfo->BVH                         ? VMEM_BVH
+         : BaseInfo->Sampler || BaseInfo->MSAA ? VMEM_SAMPLER
+                                               : VMEM_NOSAMPLER;
 }
 
 unsigned &getCounterRef(AMDGPU::Waitcnt &Wait, InstCounterType T) {
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.msaa.load.ll b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.msaa.load.ll
index 1348315e72e7bc..8da48551855570 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.msaa.load.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.msaa.load.ll
@@ -12,7 +12,7 @@ define amdgpu_ps <4 x float> @load_2dmsaa(<8 x i32> inreg %rsrc, i32 %s, i32 %t,
 ; GFX12-LABEL: load_2dmsaa:
 ; GFX12:       ; %bb.0: ; %main_body
 ; GFX12-NEXT:    image_msaa_load v[0:3], [v0, v1, v2], s[0:7] dmask:0x1 dim:SQ_RSRC_IMG_2D_MSAA unorm ; encoding: [0x06,0x20,0x46,0xe4,0x00,0x00,0x00,0x00,0x00,0x01,0x02,0x00]
-; GFX12-NEXT:    s_wait_loadcnt 0x0 ; encoding: [0x00,0x00,0xc0,0xbf]
+; GFX12-NEXT:    s_wait_samplecnt 0x0 ; encoding: [0x00,0x00,0xc2,0xbf]
 ; GFX12-NEXT:    ; return to shader part epilog
 main_body:
   %v = call <4 x float> @llvm.amdgcn.image.msaa.load.2dmsaa.v4f32.i32(i32 1, i32 %s, i32 %t, i32 %fragid, <8 x i32> %rsrc, i32 0, i32 0)
@@ -32,7 +32,7 @@ define amdgpu_ps <4 x float> @load_2dmsaa_both(<8 x i32> inreg %rsrc, ptr addrsp
 ; GFX12:       ; %bb.0: ; %main_body
 ; GFX12-NEXT:    image_msaa_load v[0:4], [v0, v1, v2], s[0:7] dmask:0x2 dim:SQ_RSRC_IMG_2D_MSAA unorm tfe lwe ; encoding: [0x0e,0x20,0x86,0xe4,0x00,0x01,0x00,0x00,0x00,0x01,0x02,0x00]
 ; GFX12-NEXT:    v_mov_b32_e32 v5, 0 ; encoding: [0x80,0x02,0x0a,0x7e]
-; GFX12-NEXT:    s_wait_loadcnt 0x0 ; encoding: [0x00,0x00,0xc0,0xbf]
+; GFX12-NEXT:    s_wait_samplecnt 0x0 ; encoding: [0x00,0x00,0xc2,0xbf]
 ; GFX12-NEXT:    global_store_b32 v5, v4, s[8:9] ; encoding: [0x08,0x80,0x06,0xee,0x00,0x00,0x00,0x02,0x05,0x00,0x00,0x00]
 ; GFX12-NEXT:    ; return to shader part epilog
 main_body:
@@ -53,7 +53,7 @@ define amdgpu_ps <4 x float> @load_2darraymsaa(<8 x i32> inreg %rsrc, i32 %s, i3
 ; GFX12-LABEL: load_2darraymsaa:
 ; GFX12:       ; %bb.0: ; %main_body
 ; GFX12-NEXT:    image_msaa_load v[0:3], [v0, v1, v2, v3], s[0:7] dmask:0x4 dim:SQ_RSRC_IMG_2D_MSAA_ARRAY unorm ; encoding: [0x07,0x20,0x06,0xe5,0x00,0x00,0x00,0x00,0x00,0x01,0x02,0x03]
-; GFX12-NEXT:    s_wait_loadcnt 0x0 ; encoding: [0x00,0x00,0xc0,0xbf]
+; GFX12-NEXT:    s_wait_samplecnt 0x0 ; encoding: [0x00,0x00,0xc2,0xbf]
 ; GFX12-NEXT:    ; return to shader part epilog
 main_body:
   %v = call <4 x float> @llvm.amdgcn.image.msaa.load.2darraymsaa.v4f32.i32(i32 4, i32 %s, i32 %t, i32 %slice, i32 %fragid, <8 x i32> %rsrc, i32 0, i32 0)
@@ -73,7 +73,7 @@ define amdgpu_ps <4 x float> @load_2darraymsaa_tfe(<8 x i32> inreg %rsrc, ptr ad
 ; GFX12:       ; %bb.0: ; %main_body
 ; GFX12-NEXT:    image_msaa_load v[0:4], [v0, v1, v2, v3], s[0:7] dmask:0x8 dim:SQ_RSRC_IMG_2D_MSAA_ARRAY unorm tfe ; encoding: [0x0f,0x20,0x06,0xe6,0x00,0x00,0x00,0x00,0x00,0x01,0x02,0x03]
 ; GFX12-NEXT:    v_mov_b32_e32 v5, 0 ; encoding: [0x80,0x02,0x0a,0x7e]
-; GFX12-NEXT:    s_wait_loadcnt 0x0 ; encoding: [0x00,0x00,0xc0,0xbf]
+; GFX12-NEXT:    s_wait_samplecnt 0x0 ; encoding: [0x00,0x00,0xc2,0xbf]
 ; GFX12-NEXT:    global_store_b32 v5, v4, s[8:9] ; encoding: [0x08,0x80,0x06,0xee,0x00,0x00,0x00,0x02,0x05,0x00,0x00,0x00]
 ; GFX12-NEXT:    ; return to shader part epilog
 main_body:
@@ -94,7 +94,7 @@ define amdgpu_ps <4 x float> @load_2dmsaa_glc(<8 x i32> inreg %rsrc, i32 %s, i32
 ; GFX12-LABEL: load_2dmsaa

[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-05-01 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad ready_for_review 
https://github.com/llvm/llvm-project/pull/90582
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-05-01 Thread Jay Foad via llvm-branch-commits

jayfoad wrote:

> Let's not backport this yet since @pendingchaos has pointed out a problem 
> with #90201.

Fixed by #90710 which I have added to this PR.

https://github.com/llvm/llvm-project/pull/90582


[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-05-02 Thread Matt Arsenault via llvm-branch-commits


@@ -187,8 +187,12 @@ VmemType getVmemType(const MachineInstr &Inst) {
   const AMDGPU::MIMGInfo *Info = AMDGPU::getMIMGInfo(Inst.getOpcode());
   const AMDGPU::MIMGBaseOpcodeInfo *BaseInfo =
       AMDGPU::getMIMGBaseOpcodeInfo(Info->BaseOpcode);
-  return BaseInfo->BVH ? VMEM_BVH
-                       : BaseInfo->Sampler ? VMEM_SAMPLER : VMEM_NOSAMPLER;
+  // We have to make an additional check for isVSAMPLE here since some
+  // instructions don't have a sampler, but are still classified as sampler
+  // instructions for the purposes of e.g. waitcnt.
+  return BaseInfo->BVH                                         ? VMEM_BVH
+         : (BaseInfo->Sampler || SIInstrInfo::isVSAMPLE(Inst)) ? VMEM_SAMPLER
+                                                               : VMEM_NOSAMPLER;

arsenm wrote:

Use a temporary variable? This nested ternary formatting is breaking my brain 

https://github.com/llvm/llvm-project/pull/90582
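
One way to read the temporary-variable suggestion above is sketched below. This is a standalone illustration only, not the code that was committed: `BaseOpcodeFlags`, `classifyVmem` and `checkClassification` are hypothetical names, the struct stands in for the two `AMDGPU::MIMGBaseOpcodeInfo` fields the hunk reads, and the `IsVSample` parameter stands in for the result of `SIInstrInfo::isVSAMPLE(Inst)`.

```cpp
#include <cassert>

// Mirrors the three categories returned by getVmemType in the hunk above.
enum VmemType { VMEM_NOSAMPLER, VMEM_SAMPLER, VMEM_BVH };

// Hypothetical stand-in for the MIMGBaseOpcodeInfo fields used here.
struct BaseOpcodeFlags {
  bool BVH;
  bool Sampler;
};

// Same decision as the nested ternary, split into an early return plus a
// named temporary. IsVSample stands in for SIInstrInfo::isVSAMPLE(Inst).
VmemType classifyVmem(const BaseOpcodeFlags &BaseInfo, bool IsVSample) {
  if (BaseInfo.BVH)
    return VMEM_BVH;
  // Some instructions have no sampler but still count as sampler
  // instructions for waitcnt purposes.
  const bool CountsAsSampler = BaseInfo.Sampler || IsVSample;
  return CountsAsSampler ? VMEM_SAMPLER : VMEM_NOSAMPLER;
}

// Usage: a sampler-less instruction that is encoded as VSAMPLE.
void checkClassification() {
  assert(classifyVmem(BaseOpcodeFlags{false, false}, true) == VMEM_SAMPLER);
}
```

The BVH check stays first, so the precedence of the original expression is unchanged.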


[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-05-31 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad closed 
https://github.com/llvm/llvm-project/pull/90582


[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-05-31 Thread Jay Foad via llvm-branch-commits

jayfoad wrote:

Too late to backport - no more 18.x releases are planned.

https://github.com/llvm/llvm-project/pull/90582


[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-04-30 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad created 
https://github.com/llvm/llvm-project/pull/90582

image_msaa_load is actually encoded as a VSAMPLE instruction and
requires the appropriate waitcnt variant.
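
As a rough illustration of why the classification matters, the sketch below maps each VMEM category to a wait mnemonic. It is a standalone model, not LLVM code: `waitMnemonicFor` and `checkWaitMnemonics` are hypothetical helpers, and `s_wait_bvhcnt` is an assumed name that does not appear in this thread. The `s_wait_loadcnt` and `s_wait_samplecnt` names come from the test updates in the patch, and the pre-gfx12 branch reflects the patch comment that all vmem types result in the same `s_waitcnt`.

```cpp
#include <cassert>
#include <string>

// Mirrors the three categories returned by getVmemType in the patch.
enum VmemType { VMEM_NOSAMPLER, VMEM_SAMPLER, VMEM_BVH };

// Hypothetical helper, not an LLVM API: picks the wait mnemonic that
// would cover an outstanding VMEM access of the given category.
std::string waitMnemonicFor(VmemType Type, bool IsGfx12Plus) {
  // Pre-gfx12: every VMEM category shares the same s_waitcnt.
  if (!IsGfx12Plus)
    return "s_waitcnt";
  switch (Type) {
  case VMEM_SAMPLER:
    return "s_wait_samplecnt";
  case VMEM_BVH:
    return "s_wait_bvhcnt"; // Assumed name; not shown in this thread.
  case VMEM_NOSAMPLER:
    break;
  }
  return "s_wait_loadcnt";
}

// Usage: misclassifying image_msaa_load as VMEM_NOSAMPLER selects the
// wrong wait on gfx12, which is the bug the patch addresses.
void checkWaitMnemonics() {
  assert(waitMnemonicFor(VMEM_NOSAMPLER, true) == "s_wait_loadcnt");
  assert(waitMnemonicFor(VMEM_SAMPLER, true) == "s_wait_samplecnt");
}
```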



[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-04-30 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad milestoned 
https://github.com/llvm/llvm-project/pull/90582


[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-04-30 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Jay Foad (jayfoad)


Changes

image_msaa_load is actually encoded as a VSAMPLE instruction and
requires the appropriate waitcnt variant.


---
Full diff: https://github.com/llvm/llvm-project/pull/90582.diff


2 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp (+6-2) 
- (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.msaa.load.ll (+13-13) 



[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-04-30 Thread Jay Foad via llvm-branch-commits

jayfoad wrote:

Let's not backport this yet since @pendingchaos has pointed out a problem with 
#90201.

https://github.com/llvm/llvm-project/pull/90582


[llvm-branch-commits] [llvm] [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (#90201) (PR #90582)

2024-04-30 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad converted_to_draft 
https://github.com/llvm/llvm-project/pull/90582