https://bugs.freedesktop.org/show_bug.cgi?id=106774
Bug ID: 106774
Summary: NIR "copy propagates" loads of SSBOs
Product: Mesa
Version: git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: glsl-compiler
Assignee: mesa-dev@lists.freedesktop.org
Reporter: i...@freedesktop.org
QA Contact: intel-3d-b...@lists.freedesktop.org
Created attachment 139956
--> https://bugs.freedesktop.org/attachment.cgi?id=139956&action=edit
Test case that exhibits the issue
If a value loaded from an SSBO is used multiple times, the ssbo_load gets copy
propagated to each use. SSBO loads are expensive, so, while this may be
technically correct, it is also bad.
Arguments can be made that the test case is not correct, but that is an
orthogonal issue. The core of the attached test case is:
layout(binding = 0) buffer bufblock {
int value;
};
...
int f;
/* This is an open-coded atomicAdd. */
do {
f = value;
} while (f != atomicCompSwap(value, f, f + 4));
For this code, the resulting NIR is (note the four load_ssbo operations):
loop {
block block_1:
/* preds: block_0 block_4 */
vec1 32 ssa_4 = load_const (0x00000001 /* 0.000000 */)
vec1 32 ssa_5 = iadd ssa_0, ssa_4
vec1 32 ssa_6 = intrinsic load_ssbo (ssa_5, ssa_0) () ()
vec1 32 ssa_7 = load_const (0x00000001 /* 0.000000 */)
vec1 32 ssa_8 = iadd ssa_0, ssa_7
vec1 32 ssa_9 = intrinsic load_ssbo (ssa_8, ssa_0) () ()
vec1 32 ssa_10 = load_const (0x00000001 /* 0.000000 */)
vec1 32 ssa_11 = iadd ssa_0, ssa_10
vec1 32 ssa_12 = intrinsic load_ssbo (ssa_11, ssa_0) () ()
vec1 32 ssa_13 = iadd ssa_12, ssa_1
vec1 32 ssa_14 = load_const (0x00000001 /* 0.000000 */)
vec1 32 ssa_15 = iadd ssa_0, ssa_14
vec1 32 ssa_16 = intrinsic ssbo_atomic_comp_swap (ssa_15,
ssa_0, ssa_9, ssa_13) () ()
vec1 32 ssa_17 = load_const (0x00000001 /* 0.000000 */)
vec1 32 ssa_18 = iadd ssa_0, ssa_17
vec1 32 ssa_19 = intrinsic load_ssbo (ssa_18, ssa_0) () ()
vec1 32 ssa_20 = ieq ssa_19, ssa_16
/* succs: block_2 block_3 */
if ssa_20 {
block block_2:
/* preds: block_1 */
break
/* succs: block_5 */
} else {
block block_3:
/* preds: block_1 */
/* succs: block_4 */
}
block block_4:
/* preds: block_3 */
/* succs: block_1 */
}
The multiple reads get a mishmosh of values (written by other threads), and
Annotating 'value' with volatile helps, but the generated code still wrong.
There should definitely *NOT* be a load_ssbo after the ssbo_atomic_comp_swap!
loop {
block block_1:
/* preds: block_0 block_4 */
vec1 32 ssa_4 = intrinsic load_ssbo (ssa_2, ssa_0) () ()
vec1 32 ssa_5 = intrinsic load_ssbo (ssa_2, ssa_0) () ()
vec1 32 ssa_6 = intrinsic load_ssbo (ssa_2, ssa_0) () ()
vec1 32 ssa_7 = iadd ssa_6, ssa_1
vec1 32 ssa_8 = intrinsic ssbo_atomic_comp_swap (ssa_2, ssa_0,
ssa_5, ssa_7) () ()
vec1 32 ssa_9 = intrinsic load_ssbo (ssa_2, ssa_0) () ()
vec1 32 ssa_10 = ieq ssa_9, ssa_8
/* succs: block_2 block_3 */
if ssa_10 {
block block_2:
/* preds: block_1 */
break
/* succs: block_5 */
} else {
block block_3:
/* preds: block_1 */
/* succs: block_4 */
}
block block_4:
/* preds: block_3 */
/* succs: block_1 */
}
Completely re-writing the loop as
int f = value;
int a;
do {
a = f;
} while ((f = atomicCompSwap(value, f, f + 4)) != a);
produces the expected results:
vec1 32 ssa_4 = intrinsic load_ssbo (ssa_2, ssa_0) () ()
/* succs: block_1 */
loop {
block block_1:
/* preds: block_0 block_4 */
vec1 32 ssa_5 = phi block_0: ssa_4, block_4: ssa_7
vec1 32 ssa_6 = iadd ssa_5, ssa_1
vec1 32 ssa_7 = intrinsic ssbo_atomic_comp_swap (ssa_2, ssa_0,
ssa_5, ssa_6) () ()
vec1 32 ssa_8 = ieq ssa_7, ssa_5
/* succs: block_2 block_3 */
if ssa_8 {
block block_2:
/* preds: block_1 */
break
/* succs: block_5 */
} else {
block block_3:
/* preds: block_1 */
/* succs: block_4 */
}
block block_4:
/* preds: block_3 */
/* succs: block_1 */
}
--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev