Re: [PATCH 5/6] mips: Implement vec_perm_const.
Richard Henderson writes:
> On 12/11/2011 04:50 AM, Richard Sandiford wrote:
>> [Mingjie, please could you help with the Loongson question near the end?]
>
> Actually, can you tell me how to test these abi combinations? I keep
> trying to use mips-sim or mips64-sim and get linker errors complaining
> of abi combinations.

I tend to use mips64{,el}-linux-gnu with a hacked-up QEMU (hacked up to
add MIPS16 to the cpu model, which isn't relevant here). But I'm surprised
*-elf is causing problems. Something like mipsisa64-elfoabi ought to just
work (I last tested that a few weeks ago).

>> Little-endian:
>>
>>   The semantics of the RTL pattern are:
>>
>>   { 0L, 0U } = { X[I3], X[I4 + 2] }, where X = { 1L, 1U, 2L, 2U }
>>
>>   so: 0L = { 1L, 1U }[I3] (= )
>>       0U = { 2L, 2U }[I4] (= )
>>
>>       = 2, = I4 ? U : L
>>       = 1, = I3 ? U : L
>>
>>   [LL] !I4 && !I3   [UL] I4 && !I3
>>   [LU] !I4 && I3    [UU] I4 && I3
>>
>> Big-endian:
>>
>>   The semantics of the RTL pattern are:
>>
>>   { 0U, 0L } = { X[I3], X[I4 + 2] }, where X = { 1U, 1L, 2U, 2L }
>>
>>   so: 0U = { 1U, 1L }[I3] (= )
>>       0L = { 2U, 2L }[I4] (= )
>>
>>       = 1, = I3 ? L : U
>>       = 2, = I4 ? L : U
>>
>>   [UU] !I3 && !I4   [UL] !I3 && I4
>>   [LU] I3 && !I4    [LL] I3 && I4. */
>>
>> which suggests that the PUL and PLU entries for big-endian should be
>> the other way around. Does that sound right, or have I misunderstood?
>
> Yes, that sounds right.
>
>> ...for little-endian, we need to pass the "U" and "L" components of the
>> mnemonic in the reverse order: the MIPS instruction specifies the upper
>> part first, whereas the rtl pattern specifies the lower part first.
>> And for little-endian, U refers to memory element 1 and L to memory
>> element 0. So I think this should be:
>
> ... Except that the actual output of the LE insn actually swaps the
> operands too. So I think these expanders should not *also* swap the
> operands. I've tidied these up a bit since then.

Hmm, are you sure? The order of the operands passed to these p??
expanders is supposed to match the order of the operands in the final
asm instruction. A user's "A = __builtin_mips_plu_ps (B, C)" corresponds
to "gen_mips_plu_ps (A, B, C)", which must always generate
"PLU.PS A, B, C", etc. So if the define_insn swaps the operands (which
from above, it must for little-endian), then these expanders need to
swap too, to undo the effect.

Or, taking the longer version from yesterday:

;; Expanders for builtins. The instruction:
;;
;;   P[UL][UL].PS , ,
;;
;; says that the upper part of is taken from half of and
;; the lower part of is taken from half of . This means
;; that the P[UL][UL].PS operand order matches memory order on big-endian
;; targets; is element 0 of the V2SF result while is element 1.
;; However, the P[UL][UL].PS operand order is the reverse of memory order
;; on little-endian targets; is element 1 of the V2SF result while
;; is element 0. The arguments to vec_perm_const_ps are always in
;; memory order.
;;
;; Similarly, "U" corresponds to element 0 on big-endian targets but
;; to element 1 on little-endian targets.

(would be nice to have these comments in the patch if nothing else).

Because of that, I think I preferred the original style, with no SET
rtl pattern in the expander, and calls to emit_insn (gen_...) in the
C code.

>> I think this is endian-dependent. For little-endian, the bottom two bits
>> of the mask determine element 0; for big-endian, the top two bits of the
>> mask do.
>
> Recall that loongson can only run in little-endian.

Doh.

> I added comments about that in the md file, but it would do no harm to
> add another here.

Thanks.

Richard
Re: [PATCH 5/6] mips: Implement vec_perm_const.
On Sun, 11 Dec 2011, Richard Sandiford wrote:
> Hans-Peter Nilsson writes:
> > Please also consider incrementing __mips_loongson_vector_rev

> For avoidance of doubt, that only applies to the latter ("as H-P
> suggests") option. The patch as posted keeps the public interface
> the same.

Correct; I misread it, sorry.

brgds, H-P
Re: [PATCH 5/6] mips: Implement vec_perm_const.
On 12/11/2011 04:50 AM, Richard Sandiford wrote:
> [Mingjie, please could you help with the Loongson question near the end?]

Actually, can you tell me how to test these abi combinations? I keep
trying to use mips-sim or mips64-sim and get linker errors complaining
of abi combinations.

> Little-endian:
>
>   The semantics of the RTL pattern are:
>
>   { 0L, 0U } = { X[I3], X[I4 + 2] }, where X = { 1L, 1U, 2L, 2U }
>
>   so: 0L = { 1L, 1U }[I3] (= )
>       0U = { 2L, 2U }[I4] (= )
>
>       = 2, = I4 ? U : L
>       = 1, = I3 ? U : L
>
>   [LL] !I4 && !I3   [UL] I4 && !I3
>   [LU] !I4 && I3    [UU] I4 && I3
>
> Big-endian:
>
>   The semantics of the RTL pattern are:
>
>   { 0U, 0L } = { X[I3], X[I4 + 2] }, where X = { 1U, 1L, 2U, 2L }
>
>   so: 0U = { 1U, 1L }[I3] (= )
>       0L = { 2U, 2L }[I4] (= )
>
>       = 1, = I3 ? L : U
>       = 2, = I4 ? L : U
>
>   [UU] !I3 && !I4   [UL] !I3 && I4
>   [LU] I3 && !I4    [LL] I3 && I4. */
>
> which suggests that the PUL and PLU entries for big-endian should be
> the other way around. Does that sound right, or have I misunderstood?

Yes, that sounds right.

> ...for little-endian, we need to pass the "U" and "L" components of the
> mnemonic in the reverse order: the MIPS instruction specifies the upper
> part first, whereas the rtl pattern specifies the lower part first.
> And for little-endian, U refers to memory element 1 and L to memory
> element 0. So I think this should be:

... Except that the actual output of the LE insn actually swaps the
operands too. So I think these expanders should not *also* swap the
operands. I've tidied these up a bit since then.

>> +static bool
>> +mips_expand_vpc_ps (struct expand_vec_perm_d *d)

I've eliminated this function since then.

>> +  /* Convert the selector into the packed 8-bit form for pshufh. */
>> +  for (i = mask = 0; i < 4; i++)
>> +    mask |= (d->perm[i] & 3) << (i * 2);
>
> I think this is endian-dependent. For little-endian, the bottom two bits
> of the mask determine element 0; for big-endian, the top two bits of the
> mask do.

Recall that loongson can only run in little-endian. I added comments
about that in the md file, but it would do no harm to add another here.

> (There's a machine in the farm, but bootstrapping on it is rather slow.)

Yeah, I started checking out the tree there yesterday and it never
completed.

> I think a lot of the endianness stuff in the patch is dependent on byte
> endianness rather than word endianness. Since we only support two out
> of the four combinations, it seems better not to worry which and simply
> use TARGET_{BIG,LITTLE}_ENDIAN instead of {WORDS,BYTES}_{BIG,LITTLE}_ENDIAN.

Sure.

This is my current patch, which doesn't have the pul/plu insns swapped,
as suggested above. I did change the loongson.h interface as H-P
suggested.

r~

commit b7790c7a9e53d66d1f348c3f2adb5b8a9bf2d93c
Author: Richard Henderson
Date: Wed Dec 7 14:17:02 2011 -0800

    mips: Implement vec_perm_const.

diff --git a/gcc/config/mips/loongson.h b/gcc/config/mips/loongson.h
index 6bfd4d7..dfd6505 100644
--- a/gcc/config/mips/loongson.h
+++ b/gcc/config/mips/loongson.h
@@ -447,15 +447,15 @@ psadbh (uint8x8_t s, uint8x8_t t)

 /* Shuffle halfwords. */
 __extension__ static __inline uint16x4_t __attribute__ ((__always_inline__))
-pshufh_u (uint16x4_t dest, uint16x4_t s, uint8_t order)
+pshufh_u (uint16x4_t s, uint8_t order)
 {
-  return __builtin_loongson_pshufh_u (dest, s, order);
+  return __builtin_loongson_pshufh_u (s, order);
 }

 __extension__ static __inline int16x4_t __attribute__ ((__always_inline__))
-pshufh_s (int16x4_t dest, int16x4_t s, uint8_t order)
+pshufh_s (int16x4_t s, uint8_t order)
 {
-  return __builtin_loongson_pshufh_s (dest, s, order);
+  return __builtin_loongson_pshufh_s (s, order);
 }

 /* Shift left logical. */
diff --git a/gcc/config/mips/loongson.md b/gcc/config/mips/loongson.md
index 225f4d1..7c7e29f 100644
--- a/gcc/config/mips/loongson.md
+++ b/gcc/config/mips/loongson.md
@@ -24,10 +24,7 @@
   UNSPEC_LOONGSON_PCMPEQ
   UNSPEC_LOONGSON_PCMPGT
   UNSPEC_LOONGSON_PEXTR
-  UNSPEC_LOONGSON_PINSR_0
-  UNSPEC_LOONGSON_PINSR_1
-  UNSPEC_LOONGSON_PINSR_2
-  UNSPEC_LOONGSON_PINSR_3
+  UNSPEC_LOONGSON_PINSRH
   UNSPEC_LOONGSON_PMADD
   UNSPEC_LOONGSON_PMOVMSK
   UNSPEC_LOONGSON_PMULHU
@@ -200,6 +197,51 @@
   "pandn\t%0,%1,%2"
   [(set_attr "type" "fmul")])

+;; Logical AND.
+(define_insn "*loongson_and"
+  [(set (match_operand:VWHB 0 "register_operand" "=f")
+        (and:VWHB (match_operand:VWHB 1 "register_operand" "f")
+                  (match_operand:VWHB 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT && TARGET_LOONGSON_VECTORS"
+  "and\t%0,%1,%2"
+  [(set_attr "type" "fmul")])
+
+;; Logical OR.
+(define_insn "*loongson_or"
+  [(set (match_operand:VWHB 0 "register_operand" "=f")
+        (ior:VWHB (match_operand:VWHB 1 "register_ope
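
For reference, the pshufh selector packing discussed earlier in this message can be sketched as standalone C. This is only an illustration under stated assumptions: pack_pshufh_order and model_pshufh are hypothetical names, and the reference model assumes that result element i is the source element chosen by bits [2i+1:2i] of the order byte, with element 0 in the low bits (the little-endian case, which the thread says is the only one Loongson runs in).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the quoted loop: pack four halfword
   selectors (each 0..3) into the 8-bit order operand, two bits per
   result element, element 0 in the low bits.  */
static uint8_t
pack_pshufh_order (const unsigned char perm[4])
{
  unsigned mask = 0;
  int i;

  for (i = 0; i < 4; i++)
    mask |= (perm[i] & 3u) << (i * 2);
  return (uint8_t) mask;
}

/* Assumed reference model of the shuffle, for checking the packing:
   dst[i] = src[bits (2i+1):(2i) of order].  */
static void
model_pshufh (const uint16_t src[4], uint8_t order, uint16_t dst[4])
{
  int i;

  for (i = 0; i < 4; i++)
    dst[i] = src[(order >> (i * 2)) & 3];
}
```

With this packing the identity selector { 0, 1, 2, 3 } gives 0xE4 and the reversing selector { 3, 2, 1, 0 } gives 0x1B. A big-endian variant would have to place element 0 in the top two bits instead, which is the point raised in the quoted review.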
Re: [PATCH 5/6] mips: Implement vec_perm_const.
Hans-Peter Nilsson writes:
> On Sun, 11 Dec 2011, Richard Sandiford wrote:
>> [Mingjie, please could you help with the Loongson question near the end?]
>
>> As H-P mentioned, this changes the __builtin_* interface for the PSHUFH
>> intrinsics. These intrinsics are supposed to be used via the inline
>> wrappers in loongson.h, so we can either keep the unused argument in
>> the pshufh_{u,s} or, as H-P suggests, remove the argument from both.
>> I don't know which is better. loongson.h needs to change either way,
>> so in the patch below, I went for the former. The latter would need
>> testsuite changes too. Mingjie, which do you think is best?
>
> Please also consider incrementing __mips_loongson_vector_rev, or
> if currently empty, set to 1. And mention PR48068 in the
> changelog; fixed in part.

For avoidance of doubt, that only applies to the latter ("as H-P
suggests") option. The patch as posted keeps the public interface
the same.

Richard
Re: [PATCH 5/6] mips: Implement vec_perm_const.
On Sun, 11 Dec 2011, Richard Sandiford wrote:
> [Mingjie, please could you help with the Loongson question near the end?]

> As H-P mentioned, this changes the __builtin_* interface for the PSHUFH
> intrinsics. These intrinsics are supposed to be used via the inline
> wrappers in loongson.h, so we can either keep the unused argument in
> the pshufh_{u,s} or, as H-P suggests, remove the argument from both.
> I don't know which is better. loongson.h needs to change either way,
> so in the patch below, I went for the former. The latter would need
> testsuite changes too. Mingjie, which do you think is best?

Please also consider incrementing __mips_loongson_vector_rev, or
if currently empty, set to 1. And mention PR48068 in the
changelog; fixed in part. (I can't see what builtin_define
does, set it to 1 or just defined?)

brgds, H-P
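
To illustrate how user code might consume the revision macro mentioned above, here is a small sketch in C. The helper name loongson_vector_rev is hypothetical, and which revision value would correspond to which pshufh interface is not settled in this thread, so the sketch only reads the macro. The "+ 0" is there so that the code still compiles if the macro turns out to be defined but empty.

```c
#include <assert.h>

/* Hypothetical helper: report the Loongson vector interface revision
   visible at compile time.  Returns 0 when the macro is not defined,
   and also when it is defined with an empty body, because "+ 0" then
   parses as a unary plus applied to 0.  */
static int
loongson_vector_rev (void)
{
#if defined (__mips_loongson_vector_rev)
  return __mips_loongson_vector_rev + 0;
#else
  return 0;
#endif
}
```

On a compiler that does not define the macro at all, the helper simply returns 0, so callers can treat 0 as "no revision information available".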
Re: [PATCH 5/6] mips: Implement vec_perm_const.
[Mingjie, please could you help with the Loongson question near the end?]

Richard Henderson writes:
> @@ -89,61 +89,102 @@
>    DONE;
>  })
>
> -; pul.ps - Pair Upper Lower
> -(define_insn "mips_pul_ps"
> +(define_insn "vec_perm_const_ps"
>    [(set (match_operand:V2SF 0 "register_operand" "=f")
> -        (vec_merge:V2SF
> -         (match_operand:V2SF 1 "register_operand" "f")
> -         (match_operand:V2SF 2 "register_operand" "f")
> -         (const_int 2)))]
> +        (vec_select:V2SF
> +          (vec_concat:V4SF
> +            (match_operand:V2SF 1 "register_operand" "f")
> +            (match_operand:V2SF 2 "register_operand" "f"))
> +          (parallel [(match_operand:SI 3 "const_0_or_1_operand" "")
> +                     (match_operand:SI 4 "const_2_or_3_operand" "")])))]
>    "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
> -  "pul.ps\t%0,%1,%2"
> +{
> +  static const int * const mnemonics[2][4] = {
> +    /* LE */ { "pll.ps\t%0,%2,%1", "pul.ps\t%0,%2,%1",
> +               "plu.ps\t%0,%2,%1", "puu.ps\t%0,%2,%1" },
> +    /* BE */ { "puu.ps\t%0,%1,%2", "plu.ps\t%0,%1,%2",
> +               "pul.ps\t%0,%1,%2", "pll.ps\t%0,%1,%2" },
> +  };
> +
> +  unsigned mask = INTVAL (operands[3]) * 2 + (INTVAL (operands[4]) - 2);
> +  return mnemonics[WORDS_BIG_ENDIAN][mask];
> +}

So I stared at this for fully an hour trying to work out all the various
orderings (vec_concat operands always in memory order, parallel selector
always in memory order, GCC vector element 0 being "upper" on big-endian
and "lower" on little-endian, P??.PS always specifying the upper part of
the result first, etc.). I ended up with:

/* Let L be the lower part of operand and U be the upper part.
   The P[UL][UL].PS instruction always specifies the upper part of the
   result first, so the instruction is:

     P.PS %0,,

   where 0U == and 0L == .

   GCC's vector indices are specified in memory order, which means
   that vector element 0 is the lower part (L) on little-endian targets
   and the upper part (U) on big-endian targets. vec_concat likewise
   concatenates in memory order, which means that operand 3 (being
   0 or 1) selects part of operand 1 and operand 4 (being 2 or 3)
   selects part of operand 2.

   Let:

     I3 = INTVAL (operands[3])
     I4 = INTVAL (operands[4]) - 2

   Taking the two endiannesses in turn:

   Little-endian:

     The semantics of the RTL pattern are:

     { 0L, 0U } = { X[I3], X[I4 + 2] }, where X = { 1L, 1U, 2L, 2U }

     so: 0L = { 1L, 1U }[I3] (= )
         0U = { 2L, 2U }[I4] (= )

         = 2, = I4 ? U : L
         = 1, = I3 ? U : L

     [LL] !I4 && !I3   [UL] I4 && !I3
     [LU] !I4 && I3    [UU] I4 && I3

   Big-endian:

     The semantics of the RTL pattern are:

     { 0U, 0L } = { X[I3], X[I4 + 2] }, where X = { 1U, 1L, 2U, 2L }

     so: 0U = { 1U, 1L }[I3] (= )
         0L = { 2U, 2L }[I4] (= )

         = 1, = I3 ? L : U
         = 2, = I4 ? L : U

     [UU] !I3 && !I4   [UL] !I3 && I4
     [LU] I3 && !I4    [LL] I3 && I4. */

which suggests that the PUL and PLU entries for big-endian should be
the other way around. Does that sound right, or have I misunderstood?

(Also, "const char *" rather than "const int *".)

The same confusion hit me with the expanders:

> +(define_expand "mips_pul_ps"
> +  [(match_operand:V2SF 0 "register_operand" "")
> +   (match_operand:V2SF 1 "register_operand" "")
> +   (match_operand:V2SF 2 "register_operand" "")]
> +  "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
> +{
> +  if (WORDS_BIG_ENDIAN)
> +    emit_insn (gen_vec_perm_const_ps (operands[0], operands[1], operands[2],
> +                                      const0_rtx, const2_rtx));
> +  else
> +    emit_insn (gen_vec_perm_const_ps (operands[0], operands[2], operands[1],
> +                                      const1_rtx, GEN_INT (3)));
> +  DONE;
> +})

This one looks like a pasto: the operands given here are the same as for
mips_puu_ps. But...

> +(define_expand "mips_plu_ps"
> +  [(match_operand:V2SF 0 "register_operand" "")
> +   (match_operand:V2SF 1 "register_operand" "")
> +   (match_operand:V2SF 2 "register_operand" "")]
> +  "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
> +{
> +  if (WORDS_BIG_ENDIAN)
> +    emit_insn (gen_vec_perm_const_ps (operands[0], operands[1], operands[2],
> +                                      const1_rtx, const2_rtx));
> +  else
> +    emit_insn (gen_vec_perm_const_ps (operands[0], operands[2], operands[1],
> +                                      const0_rtx, GEN_INT (3)));
> +  DONE;
> +})

...for little-endian, we need to pass the "U" and "L" components of the
mnemonic in the reverse order: the MIPS instruction specifies the upper
part first, whereas the rtl pattern specifies the lower part first.
And for little-endian, U refers to memory element 1 and L to memory
element 0. So I think this should be:

  if (WORDS_BIG_ENDIAN)
    emit_insn (gen_vec_perm_const_ps (oper
Re: [PATCH 5/6] mips: Implement vec_perm_const.
On 12/08/2011 10:08 PM, Hans-Peter Nilsson wrote:
> On Thu, 8 Dec 2011, Richard Henderson wrote:
>> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
>> index d3fd709..f1c3665 100644
>> --- a/gcc/config/mips/mips.c
>> +++ b/gcc/config/mips/mips.c
>
>> @@ -13021,8 +13015,8 @@ static const struct mips_builtin_description
>> mips_builtins[] = {
>>    LOONGSON_BUILTIN (pasubub, MIPS_UV8QI_FTYPE_UV8QI_UV8QI),
>>    LOONGSON_BUILTIN (biadd, MIPS_UV4HI_FTYPE_UV8QI),
>>    LOONGSON_BUILTIN (psadbh, MIPS_UV4HI_FTYPE_UV8QI_UV8QI),
>> -  LOONGSON_BUILTIN_SUFFIX (pshufh, u, MIPS_UV4HI_FTYPE_UV4HI_UV4HI_UQI),
>> -  LOONGSON_BUILTIN_SUFFIX (pshufh, s, MIPS_V4HI_FTYPE_V4HI_V4HI_UQI),
>> +  LOONGSON_BUILTIN_SUFFIX (pshufh, u, MIPS_UV4HI_FTYPE_UV4HI_UQI),
>> +  LOONGSON_BUILTIN_SUFFIX (pshufh, s, MIPS_V4HI_FTYPE_V4HI_UQI),
>>    LOONGSON_BUILTIN_SUFFIX (psllh, u, MIPS_UV4HI_FTYPE_UV4HI_UQI),
>
> Looks like a brute-force (ignoring backward compatibility) fix
> for PR48068 item 2. If going that route, I'd suggest at least
> increment the __mips_loongson_vector_rev. Also, loongson.h
> needs the corresponding adjustment.

Thanks for the pointer. I'll clean this up along the increment revision
line, unless Richard S has another suggestion.

r~
Re: [PATCH 5/6] mips: Implement vec_perm_const.
On Thu, 8 Dec 2011, Richard Henderson wrote:
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index d3fd709..f1c3665 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c

> @@ -13021,8 +13015,8 @@ static const struct mips_builtin_description
> mips_builtins[] = {
>    LOONGSON_BUILTIN (pasubub, MIPS_UV8QI_FTYPE_UV8QI_UV8QI),
>    LOONGSON_BUILTIN (biadd, MIPS_UV4HI_FTYPE_UV8QI),
>    LOONGSON_BUILTIN (psadbh, MIPS_UV4HI_FTYPE_UV8QI_UV8QI),
> -  LOONGSON_BUILTIN_SUFFIX (pshufh, u, MIPS_UV4HI_FTYPE_UV4HI_UV4HI_UQI),
> -  LOONGSON_BUILTIN_SUFFIX (pshufh, s, MIPS_V4HI_FTYPE_V4HI_V4HI_UQI),
> +  LOONGSON_BUILTIN_SUFFIX (pshufh, u, MIPS_UV4HI_FTYPE_UV4HI_UQI),
> +  LOONGSON_BUILTIN_SUFFIX (pshufh, s, MIPS_V4HI_FTYPE_V4HI_UQI),
>    LOONGSON_BUILTIN_SUFFIX (psllh, u, MIPS_UV4HI_FTYPE_UV4HI_UQI),

Looks like a brute-force (ignoring backward compatibility) fix
for PR48068 item 2. If going that route, I'd suggest at least
increment the __mips_loongson_vector_rev. Also, loongson.h
needs the corresponding adjustment.

(No specific interest in Loongson, FWIW.)

brgds, H-P
[PATCH 5/6] mips: Implement vec_perm_const.
---
 gcc/config/mips/loongson.md    |   24 +++-
 gcc/config/mips/mips-modes.def |    1 +
 gcc/config/mips/mips-protos.h  |    1 +
 gcc/config/mips/mips-ps-3d.md  |  145 ++
 gcc/config/mips/mips.c         |  266 ++--
 gcc/config/mips/predicates.md  |    7 +-
 6 files changed, 376 insertions(+), 68 deletions(-)

diff --git a/gcc/config/mips/loongson.md b/gcc/config/mips/loongson.md
index 225f4d1..23c37d7 100644
--- a/gcc/config/mips/loongson.md
+++ b/gcc/config/mips/loongson.md
@@ -403,12 +403,11 @@
 ;; Shuffle halfwords.
 (define_insn "loongson_pshufh"
   [(set (match_operand:VH 0 "register_operand" "=f")
-        (unspec:VH [(match_operand:VH 1 "register_operand" "0")
-                    (match_operand:VH 2 "register_operand" "f")
-                    (match_operand:SI 3 "register_operand" "f")]
+        (unspec:VH [(match_operand:VH 1 "register_operand" "f")
+                    (match_operand:SI 2 "register_operand" "f")]
                    UNSPEC_LOONGSON_PSHUFH))]
   "TARGET_HARD_FLOAT && TARGET_LOONGSON_VECTORS"
-  "pshufh\t%0,%2,%3"
+  "pshufh\t%0,%1,%2"
   [(set_attr "type" "fmul")])

 ;; Shift left logical.
@@ -479,7 +478,7 @@
   [(set_attr "type" "fadd")])

 ;; Unpack high data.
-(define_insn "vec_interleave_high"
+(define_insn "loongson_punpckh"
   [(set (match_operand:VWHB 0 "register_operand" "=f")
         (unspec:VWHB [(match_operand:VWHB 1 "register_operand" "f")
                       (match_operand:VWHB 2 "register_operand" "f")]
@@ -489,7 +488,7 @@
   [(set_attr "type" "fdiv")])

 ;; Unpack low data.
-(define_insn "vec_interleave_low"
+(define_insn "loongson_punpckl"
   [(set (match_operand:VWHB 0 "register_operand" "=f")
         (unspec:VWHB [(match_operand:VWHB 1 "register_operand" "f")
                       (match_operand:VWHB 2 "register_operand" "f")]
@@ -498,6 +497,19 @@
   "punpckl\t%0,%1,%2"
   [(set_attr "type" "fdiv")])

+(define_expand "vec_perm_const"
+  [(match_operand:VWHB 0 "register_operand" "")
+   (match_operand:VWHB 1 "register_operand" "")
+   (match_operand:VWHB 2 "register_operand" "")
+   (match_operand:VWHB 3 "" "")]
+  "TARGET_HARD_FLOAT && TARGET_LOONGSON_VECTORS"
+{
+  if (mips_expand_vec_perm_const (operands))
+    DONE;
+  else
+    FAIL;
+})
+
 ;; Integer division and modulus. For integer multiplication, see mips.md.

 (define_insn "div3"
diff --git a/gcc/config/mips/mips-modes.def b/gcc/config/mips/mips-modes.def
index b9c508b..03b9632 100644
--- a/gcc/config/mips/mips-modes.def
+++ b/gcc/config/mips/mips-modes.def
@@ -29,6 +29,7 @@ FLOAT_MODE (TF, 16, mips_quad_format);
 VECTOR_MODES (INT, 8);        /* V8QI V4HI V2SI */
 VECTOR_MODES (FLOAT, 8);      /*      V4HF V2SF */
 VECTOR_MODES (INT, 4);        /*      V4QI V2HI */
+VECTOR_MODES (FLOAT, 16);

 VECTOR_MODES (FRACT, 4);      /* V4QQ V2HQ */
 VECTOR_MODES (UFRACT, 4);     /* V4UQQ V2UHQ */
diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index dbabdff..37c958d 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -328,6 +328,7 @@ extern void mips_expand_atomic_qihi (union mips_gen_fn_ptrs,
                                      rtx, rtx, rtx, rtx);

 extern void mips_expand_vector_init (rtx, rtx);
+extern bool mips_expand_vec_perm_const (rtx op[4]);

 extern bool mips_eh_uses (unsigned int);
 extern bool mips_epilogue_uses (unsigned int);
diff --git a/gcc/config/mips/mips-ps-3d.md b/gcc/config/mips/mips-ps-3d.md
index 504f43c..d81abf8 100644
--- a/gcc/config/mips/mips-ps-3d.md
+++ b/gcc/config/mips/mips-ps-3d.md
@@ -89,61 +89,102 @@
   DONE;
 })

-; pul.ps - Pair Upper Lower
-(define_insn "mips_pul_ps"
+(define_insn "vec_perm_const_ps"
   [(set (match_operand:V2SF 0 "register_operand" "=f")
-        (vec_merge:V2SF
-         (match_operand:V2SF 1 "register_operand" "f")
-         (match_operand:V2SF 2 "register_operand" "f")
-         (const_int 2)))]
+        (vec_select:V2SF
+          (vec_concat:V4SF
+            (match_operand:V2SF 1 "register_operand" "f")
+            (match_operand:V2SF 2 "register_operand" "f"))
+          (parallel [(match_operand:SI 3 "const_0_or_1_operand" "")
+                     (match_operand:SI 4 "const_2_or_3_operand" "")])))]
   "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
-  "pul.ps\t%0,%1,%2"
+{
+  static const int * const mnemonics[2][4] = {
+    /* LE */ { "pll.ps\t%0,%2,%1", "pul.ps\t%0,%2,%1",
+               "plu.ps\t%0,%2,%1", "puu.ps\t%0,%2,%1" },
+    /* BE */ { "puu.ps\t%0,%1,%2", "plu.ps\t%0,%1,%2",
+               "pul.ps\t%0,%1,%2", "pll.ps\t%0,%1,%2" },
+  };
+
+  unsigned mask = INTVAL (operands[3]) * 2 + (INTVAL (operands[4]) - 2);
+  return mnemonics[WORDS_BIG_ENDIAN][mask];
+}
   [(set_attr "type" "fmove")
    (set_attr "mode" "SF")])

-; puu.ps - Pair upper upper
-(define_insn "mips_puu_ps"
-  [(set (match_operand:V2SF 0 "register_operand" "=f")
-        (vec_merge:V2SF
-         (match_operand:V2SF 1 "register_operand"