From 80dd0d9827b39087d936d1f9c77b59f42c933a75 Mon Sep 17 00:00:00 2001
Message-Id: <80dd0d9827b39087d936d1f9c77b59f42c933a75.1736179734.git.george.st...@arm.com>
In-Reply-To: <[email protected]>
References: <[email protected]>
From: George Steed <[email protected]>
Date: Mon, 23 Dec 2024 14:14:12 +0000
Subject: [PATCH 3/6] pixel-util-sve2.S: Fix branch target in
 pixel_sub_ps_64x64_sve2

When branching to the loops that handle longer SVE vectors, the branch
target in the "vl_gt_16" block was written incorrectly as "vl_gt_16"
itself, causing an infinite loop. Fix this by comparing the vector
length against 32 and branching to the "vl_gt_48" case instead.

Co-authored-by: Hari Limaye <[email protected]>
---
 source/common/aarch64/pixel-util-sve2.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/source/common/aarch64/pixel-util-sve2.S b/source/common/aarch64/pixel-util-sve2.S
index 00aa2f984..b2b4d24c1 100644
--- a/source/common/aarch64/pixel-util-sve2.S
+++ b/source/common/aarch64/pixel-util-sve2.S
@@ -408,8 +408,8 @@ function PFX(pixel_sub_ps_64x64_sve2)
     ret
 .vl_gt_16_pixel_sub_ps_64x64:
     rdvl            x9, #1
-    cmp             x9, #16
-    bgt             .vl_gt_16_pixel_sub_ps_64x64
+    cmp             x9, #32
+    bgt             .vl_gt_48_pixel_sub_ps_64x64
     ptrue           p0.b, vl32
     mov             w12, #16
 .vl_gt_16_loop_sub_ps_64_sve2:
-- 
2.34.1
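
Note: the control flow changed by this hunk can be modelled in C as a rough
sketch. The function and enum names below are hypothetical and do not exist
in x265. The initial "vl_bytes <= 16" check is an assumption about the
function entry, which the hunk does not show. vl_bytes stands for the value
that "rdvl x9, #1" places in x9 (the SVE vector length in bytes), and
max_steps is an artificial bound so that the model of the old code
terminates.

```c
/* Hypothetical model of the dispatch in pixel_sub_ps_64x64_sve2.
 * Only the two compare/branch pairs shown in the hunk are modelled. */
enum path { PATH_VL16, PATH_VL32, PATH_LONGER, PATH_STUCK };

/* Pre-patch: inside .vl_gt_16 the code did "cmp x9, #16; bgt .vl_gt_16". */
enum path dispatch_before(int vl_bytes, int max_steps)
{
    if (vl_bytes <= 16)              /* assumed entry check, not in the hunk */
        return PATH_VL16;

    for (int step = 0; step < max_steps; step++) {
        if (vl_bytes > 16)           /* always true here: branch to own label */
            continue;
        return PATH_VL32;            /* never reached once vl_bytes > 16 */
    }
    return PATH_STUCK;               /* stands in for the infinite loop */
}

/* Post-patch: "cmp x9, #32; bgt .vl_gt_48". */
enum path dispatch_after(int vl_bytes)
{
    if (vl_bytes <= 16)              /* assumed entry check, not in the hunk */
        return PATH_VL16;
    if (vl_bytes > 32)               /* longer vectors go to the vl_gt_48 code */
        return PATH_LONGER;
    return PATH_VL32;                /* falls through to "ptrue p0.b, vl32" */
}
```

In this model every vector length above 16 bytes gets stuck before the
patch, because the value tested never changes between iterations. After the
patch, a 32-byte vector falls through to the vl32 loop and anything longer
takes the vl_gt_48 branch.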
