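When branching to the loops that handle longer SVE vectors, the branch in the "vl_gt_16" block was previously written to target its own "vl_gt_16" label, causing an infinite loop whenever the vector length is greater than 16 bytes. Fix this by comparing the vector length against 32 and adjusting the branch target to correctly refer to the "vl_gt_48" case instead.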

Co-authored-by: Hari Limaye <[email protected]>
---
 source/common/aarch64/pixel-util-sve2.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/source/common/aarch64/pixel-util-sve2.S b/source/common/aarch64/pixel-util-sve2.S
index 00aa2f984..b2b4d24c1 100644
--- a/source/common/aarch64/pixel-util-sve2.S
+++ b/source/common/aarch64/pixel-util-sve2.S
@@ -408,8 +408,8 @@ function PFX(pixel_sub_ps_64x64_sve2)
     ret
 .vl_gt_16_pixel_sub_ps_64x64:
     rdvl            x9, #1
-    cmp             x9, #16
-    bgt             .vl_gt_16_pixel_sub_ps_64x64
+    cmp             x9, #32
+    bgt             .vl_gt_48_pixel_sub_ps_64x64
     ptrue           p0.b, vl32
     mov             w12, #16
 .vl_gt_16_loop_sub_ps_64_sve2:
-- 
2.34.1

From 80dd0d9827b39087d936d1f9c77b59f42c933a75 Mon Sep 17 00:00:00 2001
Message-Id: <80dd0d9827b39087d936d1f9c77b59f42c933a75.1736179734.git.george.st...@arm.com>
In-Reply-To: <[email protected]>
References: <[email protected]>
From: George Steed <[email protected]>
Date: Mon, 23 Dec 2024 14:14:12 +0000
Subject: [PATCH 3/6] pixel-util-sve2.S: Fix branch target in pixel_sub_ps_64x64_sve2
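
For anyone who would rather read the control flow in C than in AArch64 assembly, the dispatch before and after this patch can be sketched roughly as follows. This is only an illustrative model, not x265 code: the enum, the function names, the spin cap used to make the hang observable, and the initial "vl <= 16" check (which lies outside the hunk and is assumed here) are all hypothetical. Only the two comparisons and branch targets inside the "vl_gt_16" block are taken from the diff.

```c
#include <assert.h>

/* Hypothetical labels for the code paths; names are illustrative only. */
enum path { PATH_VL16, PATH_VL32, PATH_VL_GT_48, PATH_HANG };

/* Model of the old code. vl_bytes stands for the value of "rdvl x9, #1". */
static enum path dispatch_buggy(unsigned vl_bytes)
{
    int spins = 0;

    /* Assumption: an earlier check (not in the hunk) only enters the
     * vl_gt_16 block when the vector length exceeds 16 bytes. */
    if (vl_bytes <= 16)
        return PATH_VL16;

    /* .vl_gt_16: cmp x9, #16 ; bgt .vl_gt_16 (branches to itself).
     * The condition never changes, so the real code spins forever.
     * The spin cap exists only so this model can report the hang. */
    while (vl_bytes > 16) {
        if (++spins > 1000)
            return PATH_HANG;
    }
    return PATH_VL32; /* unreachable for vl_bytes > 16 */
}

/* Model of the patched code. */
static enum path dispatch_fixed(unsigned vl_bytes)
{
    if (vl_bytes <= 16)          /* same assumed earlier check */
        return PATH_VL16;

    /* .vl_gt_16: cmp x9, #32 ; bgt .vl_gt_48 */
    if (vl_bytes > 32)
        return PATH_VL_GT_48;

    /* Fall through to the "ptrue p0.b, vl32" loop. */
    return PATH_VL32;
}

/* Small self-check contrasting the two models. */
static void dispatch_self_check(void)
{
    assert(dispatch_buggy(32) == PATH_HANG);
    assert(dispatch_fixed(32) == PATH_VL32);
    assert(dispatch_fixed(64) == PATH_VL_GT_48);
}
```

In this model, calling dispatch_self_check() passes: a 32-byte vector length hangs in the old dispatch but reaches the vl32 loop in the patched one, and anything longer than 32 bytes is routed to the "vl_gt_48" case.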

_______________________________________________
x265-devel mailing list
[email protected]
https://mailman.videolan.org/listinfo/x265-devel
