Hi,
On 2023/6/7 10:31, Lulu Cheng wrote:
If the $ra register is modified during the jump to the jump table, hardware
branch prediction is disrupted, resulting in a significant increase in the
branch misprediction rate and hurting performance.
Thanks for the insight! This is the kind of improvement that will
probably become a lot harder to even *sight* without uarch details
available.
However, I think it's better to also include a minimized test case to
ensure the compiled code doesn't regress. (Comparison of relevant
statistics, e.g. output of perf stat, would be even nicer to have!)
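For illustration, an input along these lines might be a starting point for
such a test. This is an untested sketch: the function names, the choice of
-O2, and the assumption that GCC lowers this particular switch to a jump
table are all mine, not taken from the patch. I left out a dg-final scan on
purpose, because a plain "jr $r1" also appears as the function return, so the
pattern would need more care than I can verify here.

```c
/* Hypothetical reduced input for the tablejump change.  The names and
   options are assumptions, not part of the patch.  */
/* { dg-do compile } */
/* { dg-options "-O2" } */

/* Kept out of line so that dispatch () is a non-leaf function; $ra is
   then saved in the prologue, which is the situation where it could be
   picked as a scratch register.  */
__attribute__ ((noinline)) int
helper (int x)
{
  return x * 3 + 1;
}

/* A dense switch; whether it becomes a jump table depends on the
   compiler's heuristics.  */
int
dispatch (int sel, int x)
{
  switch (sel)
    {
    case 0: return helper (x) + 1;
    case 1: return helper (x + 2) - 7;
    case 2: return helper (x * x);
    case 3: return helper (x - 5) * 2;
    case 4: return helper (x) ^ 0x55;
    case 5: return helper (-x);
    case 6: return helper (x + 11) + x;
    case 7: return helper (x >> 1);
    default: return -1;
    }
}
```

Inspecting the -S output for the register used by the indirect jump before
and after the patch would then show whether the constraint change takes
effect.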
gcc/ChangeLog:
* config/loongarch/loongarch.md: Change register constraint to 'q'.
---
gcc/config/loongarch/loongarch.md | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 816a943d155..f9b64173104 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2926,9 +2926,11 @@ (define_expand "tablejump"
DONE;
})
+;; Jump to the jump table.  Avoid using the $r1 register so that
+;; hardware branch prediction is not affected.
(define_insn "@tablejump<mode>"
[(set (pc)
- (match_operand:P 0 "register_operand" "r"))
+ (match_operand:P 0 "register_operand" "q"))
(use (label_ref (match_operand 1 "" "")))]
""
"jr\t%0"