Re: [PATCH 1/2]AArch64: make aarch64_simd_vec_unpack_lo_/_hi_ consistent.

2024-07-04 Thread Richard Sandiford
Tamar Christina writes:
> Hi All,
>
> The fix for PR18127 reworked the uxtl to zip optimization.
> In doing so it undid the changes in aarch64_simd_vec_unpack_lo_ and this now
> no longer matches aarch64_simd_vec_unpack_hi_. It still works because the
> RTL generated by aarch64_simd_vec_unpack

RE: [PATCH 1/2]AArch64: make aarch64_simd_vec_unpack_lo_/_hi_ consistent.

2024-07-04 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, July 4, 2024 12:46 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 1/2]AArch64: make aarch64_

Re: [PATCH 1/2]AArch64: make aarch64_simd_vec_unpack_lo_/_hi_ consistent.

2024-07-04 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Thursday, July 4, 2024 12:46 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; ktkac...@gcc.gnu.or

Re: [PATCH 1/2]AArch64: make aarch64_simd_vec_unpack_lo_/_hi_ consistent.

2024-07-04 Thread Richard Sandiford
; Marcus Shawcroft >>> ; ktkac...@gcc.gnu.org >>> Subject: Re: [PATCH 1/2]AArch64: make aarch64_simd_vec_unpack_lo_/_hi_ >>> consistent. >>> >>> Tamar Christina writes: >>> > Hi All, >>> > >>> > The fix for PR18127 rew

RE: [PATCH 1/2]AArch64: make aarch64_simd_vec_unpack_lo_/_hi_ consistent.

2024-07-05 Thread Tamar Christina
> > The principle is that, say:
> >
> > (vec_select:V2SI (reg:V2DI R) (parallel [(const_int 0) (const_int 1)]))
> >
> > is (for little-endian) equivalent to:
> >
> > (subreg:V2SI (reg:V2DI R) 0)
>
> Sigh, of course I meant V4SI rather than V2DI in the above :)
>
> > and similarly for the equi

Re: [PATCH 1/2]AArch64: make aarch64_simd_vec_unpack_lo_/_hi_ consistent.

2024-07-05 Thread Richard Sandiford
Tamar Christina writes:
>> > The principle is that, say:
>> >
>> > (vec_select:V2SI (reg:V2DI R) (parallel [(const_int 0) (const_int 1)]))
>> >
>> > is (for little-endian) equivalent to:
>> >
>> > (subreg:V2SI (reg:V2DI R) 0)
>>
>> Sigh, of course I meant V4SI rather than V2DI in the above :)
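
The equivalence quoted above (with the source mode read as V4SI, per the correction) can be illustrated with a rough Python model. This is not GCC code. It assumes a 128-bit register held as a plain integer with lane 0 in the least significant bits, which is the little-endian case the message names; big-endian is not modelled. The helpers `pack_lanes`, `unpack_lanes`, `vec_select` and `subreg` are hypothetical names invented for this sketch.

```python
def pack_lanes(lanes, bits=32):
    """Pack lanes into one integer, lane 0 in the least significant bits.

    Assumption: this mirrors little-endian lane numbering only.
    """
    mask = (1 << bits) - 1
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & mask) << (i * bits)
    return value


def unpack_lanes(value, count, bits=32):
    """Split an integer back into `count` lanes, lane 0 first."""
    mask = (1 << bits) - 1
    return [(value >> (i * bits)) & mask for i in range(count)]


def vec_select(lanes, indices):
    """Model of (vec_select X (parallel [indices...])): pick lanes by index."""
    return [lanes[i] for i in indices]


def subreg(value, byte_offset, size_bytes):
    """Model of a little-endian subreg: the offset counts bytes from the
    least significant end of the register image."""
    return (value >> (8 * byte_offset)) & ((1 << (8 * size_bytes)) - 1)


# A V4SI value: four 32-bit lanes in one 128-bit register.
v4si = [0x11111111, 0x22222222, 0x33333333, 0x44444444]
reg = pack_lanes(v4si)

# (vec_select:V2SI (reg:V4SI R) (parallel [(const_int 0) (const_int 1)]))
selected = vec_select(v4si, [0, 1])

# (subreg:V2SI (reg:V4SI R) 0): the first 8 bytes, read as two 32-bit lanes.
via_subreg = unpack_lanes(subreg(reg, 0, 8), 2)

assert selected == via_subreg
```

In this model both forms read the same 64 bits, so selecting lanes 0 and 1 and taking the subreg at byte offset 0 give the same two-lane value.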