Re: [PATCH 12/43] i386: Emulate MMX vec_dupv2si with SSE

2019-02-10 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 2:04 AM H.J. Lu  wrote:
>
> On Sun, Feb 10, 2019 at 1:49 PM Uros Bizjak  wrote:
> >
> > On Sun, Feb 10, 2019 at 10:45 PM Uros Bizjak  wrote:
> >
> > > > > > +  [(const_int 0)]
> > > > > > +{
> > > > > > +  /* Emulate MMX vec_dupv2si with SSE vec_dupv4si.  */
> > > > > > +  rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> > > > > > +  rtx insn = gen_vec_dupv4si (op0, operands[1]);
> > > > > > +  emit_insn (insn);
> > > > > > +  DONE;
> > > > >
> > > > > Please write this simple RTX explicitly in the place of (const_int 0) 
> > > > > above.
> > > >
> > > > rtx insn = gen_vec_dupv4si (op0, operands[1]);
> > > >
> > > > is easy.   How do I write
> > > >
> > > > rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> > > >
> > > > in place of  (const_int 0)?
> > >
> > >   [(set (match_dup 2)
> > > (vec_duplicate:V4SI (match_dup 1)))]
> > >
> > > with
> > >
> > > "operands[2] = gen_rtx_REG (V4SImode, REGNO (operands[0]));"
> > >
> > > or even better:
> > >
> > > "operands[2] = gen_lowpart (V4SImode, operands[0]);"
> > >
> > > in the preparation statement.
> >
> > Even shorter is
> >
> > "operands[0] = gen_lowpart (V4SImode, operands[0]);"
> >
> > and use (match_dup 0) instead of (match_dup 2) in the RTX.
> >
> > There are plenty of examples throughout sse.md.
> >
>
> This works:
>
> (define_insn_and_split "*vec_dupv2si"
>   [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
>         (vec_duplicate:V2SI
>           (match_operand:SI 1 "register_operand" "0,0,Yv")))]
>   "TARGET_MMX || TARGET_MMX_WITH_SSE"
>   "@
>    punpckldq\t%0, %0
>    #
>    #"
>   "TARGET_MMX_WITH_SSE && reload_completed"
>   [(set (match_dup 0)
>         (vec_duplicate:V4SI (match_dup 1)))]
>   "operands[0] = gen_rtx_REG (V4SImode, REGNO (operands[0]));"
>   [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
>    (set_attr "type" "mmxcvt,ssemov,ssemov")
>    (set_attr "mode" "DI,TI,TI")])

If it works, then gen_lowpart is preferred due to extra checks.
However, it would result in a paradoxical subreg, so I wonder if these
extra checks allow this transformation.

Uros.


Re: [PATCH] i386: Use OI/TImode in *mov[ot]i_internal_avx with AVX512VL

2019-02-10 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 3:35 AM Alan Modra  wrote:
>
> On Fri, Feb 08, 2019 at 10:51:34AM +0100, Uros Bizjak wrote:
> > On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu  wrote:
> > >
> > > OImode and TImode moves must be done in XImode to access upper 16
> > > vector registers without AVX512VL.  With AVX512VL, we can access
> > > upper 16 vector registers in OImode and TImode.
> > >
> > > PR target/89229
> > > * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> > > upper 16 vector registers without TARGET_AVX512VL.
> > > (*movti_internal): Likewise.
> >
> > Please use (not (match_test "...")) instead of (match_test "!...") and
>
> I'm curious.  Is there a reason other than style to ask for this
> change?

It is style that we want to keep throughout i386 *.md files,
otherwise, it should result in identical code.

Uros.


Re: Follow-up-fix 2 to "[PATCH] Move PR84877 fix elsewhere (PR bootstrap/88450)"

2019-02-10 Thread Richard Biener
On February 11, 2019 2:09:30 AM GMT+01:00, Hans-Peter Nilsson wrote:
>Here's the follow-up, getting rid of the observed
>alignment-padding in execute/930126-1.c: the x parameter in f
>spuriously being runtime-aligned to BITS_PER_WORD.  I separated
>this change because this is an older issue, a change introduced
>in r94104 where BITS_PER_WORD was chosen perhaps because we
>expect register-sized writes into this area.  Here, we instead
>align to a minimum of PREFERRED_STACK_BOUNDARY, but of course
>gated on !  STRICT_ALIGNMENT.
>
>Regtested cris-elf and x86_64-pc-linux-gnu.
>
>Ok to commit?
>
>gcc:
>   * function.c (assign_parm_setup_block): If not STRICT_ALIGNMENT,
>   instead of always BITS_PER_WORD, align the stacked
>   parameter to a minimum PREFERRED_STACK_BOUNDARY.
>
>--- function.c.orig2   Sat Feb  9 00:53:17 2019
>+++ function.c Sat Feb  9 23:21:35 2019
>@@ -2912,7 +2912,10 @@ assign_parm_setup_block (struct assign_p
>   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
>   if (stack_parm == 0)
> {
>-  SET_DECL_ALIGN (parm, MAX (DECL_ALIGN (parm), BITS_PER_WORD));
>+  HOST_WIDE_INT min_parm_align
>+  = STRICT_ALIGNMENT ? BITS_PER_WORD : PREFERRED_STACK_BOUNDARY;

Shouldn't it be MIN (...) of BOTH? 

>+
>+  SET_DECL_ALIGN (parm, MAX (DECL_ALIGN (parm), min_parm_align));
>   if (DECL_ALIGN (parm) > MAX_SUPPORTED_STACK_ALIGNMENT)
>   {
> rtx allocsize = gen_int_mode (size_stored, Pmode);
>
>brgds, H-P



Re: [PATCH] Updated patches for the port of gccgo to GNU/Hurd

2019-02-10 Thread Ian Lance Taylor
On Sun, Feb 10, 2019 at 3:41 AM Svante Signell  wrote:
>
> On Sat, 2019-02-09 at 23:57 +0100, Svante Signell wrote:
> > On Sat, 2019-02-09 at 14:40 -0800, Ian Lance Taylor wrote:
> > > On Fri, Feb 8, 2019 at 3:07 PM Matthias Klose  wrote:
> > > > On 07.02.19 06:04, Ian Lance Taylor wrote:
> > > > What are the lines before that in the log?  For some reason libtool is
> > > > being invoked with no source files.  The lines before the failing line
> > > should show an invocation of match.sh that determines the source
> > > files.
> >
> > Thanks for your job upstreaming the patches!
> >
> > I've found some problems. Current problem is with the mksysinfo.sh patch.
> > But there are some other things missing. New patches will be submitted
> > tomorrow.
>
> Attached are three additional patches needed to build libgo on GNU/Hurd:
> src_libgo_mksysinfo.sh.diff
> src_libgo_go_syscall_wait.c.diff
> src_libgo_testsuite_gotest.diff
>
> For the first patch, src_libgo_mksysinfo.sh.diff, I had to go back to the old
> version, using sed -i -e. As written now ${fsid_to_dev} expands to
> fsid_to_dev='-e '\''s/st_fsid/Dev/'\''' resulting in: "sed: -e expression #4,
> char 1: unknown command: `''". Unfortunately, I have not yet been able to
> modify the expansion to omit the single quotes around the shell variable.

I'm sorry, I don't want to use "sed -i".  That loses the original file
and makes it harder to reconstruct what has happened.

> The second patch, src_libgo_go_syscall_wait.c.diff, is needed since WCONTINUED
> is not defined and is needed for WIFCONTINUED to be defined in wait.h.

I don't understand that.  <sys/wait.h> is a system header file.  Are
you saying that it is impossible to use <sys/wait.h> and WIFCONTINUED
unless your source code does a #define WCONTINUED before #include'ing
<sys/wait.h>?  That seems like a bug in the Hurd library code.
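
A minimal sketch of the usage being described, assuming a POSIX-like <sys/wait.h>: the header is included with no prior #define, and the "continued" check is only compiled where the header itself provides WIFCONTINUED.  The enum and the function name are hypothetical and are not taken from any of the patches in this thread.

```c
#include <assert.h>
#include <sys/wait.h>

/* Hypothetical classification, used only for this sketch.  */
enum wait_kind
{
  WAIT_EXITED,
  WAIT_SIGNALED,
  WAIT_STOPPED,
  WAIT_CONTINUED,
  WAIT_UNKNOWN
};

/* Classify a status value of the kind returned by waitpid ().  */
static enum wait_kind
classify_wait_status (int status)
{
  if (WIFEXITED (status))
    return WAIT_EXITED;
  if (WIFSIGNALED (status))
    return WAIT_SIGNALED;
  if (WIFSTOPPED (status))
    return WAIT_STOPPED;
#ifdef WIFCONTINUED
  /* Only test for "continued" where the system header provides
     the macro; nothing is defined by hand before the #include.  */
  if (WIFCONTINUED (status))
    return WAIT_CONTINUED;
#endif
  return WAIT_UNKNOWN;
}
```

If a system only exposes WIFCONTINUED after the caller defines WCONTINUED by hand, the guard above silently drops the check, which is the behaviour questioned in this reply.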

> The third patch, src_libgo_testsuite_gotest.diff, is not strictly needed, but
> when running the tests this annoying text is displayed: "ps: comm: Unknown
> format spec"

I get that "comm" doesn't work, but the change in that patch is simply
incorrect.  If you don't pass "comm", the "grep sleep" will never
succeed.  If there is no way to support this code on Hurd then we
should skip it, not put in a command that can never work.

Ian


Re: [PATCH] rs6000: Vector shift-right should honor modulo semantics

2019-02-10 Thread Bill Schmidt
On 2/10/19 8:42 PM, Bill Schmidt wrote:
> On 2/10/19 4:05 PM, Segher Boessenkool wrote:
>> Hi Bill,
>>
>> On Sun, Feb 10, 2019 at 10:10:02AM -0600, Bill Schmidt wrote:
>>> I've added executable tests for both shift-right algebraic and shift-right 
>>> logical.
>>> Both fail prior to applying the patch, and work correctly afterwards.
>> Please add a test for left shifts, as well?
> Can do.  I verified that left shifts were not broken and figured a test case
> had been added then, but have not checked.  Good to test this particular
> scenario, though.
>
>>> 2019-02-08  Bill Schmidt  
>>>
>>> * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Shift-right
>>> and shift-left vector built-ins need to include a TRUNC_MOD_EXPR
>>> for correct semantics.  Also, don't expand a vector-splat if there
>>> is a type mismatch; let the back end handle it.
>> Does it always result in just the shift instruction now?  Does the modulo
>> ever remain?  (Maybe at -O0?)  Modulo is hugely expensive; it is always
>> modulo by a power of two, so a simple bitmask, so maybe write that directly
>> instead?
> We always get the shift.  For -mcpu=power8, we always load the mask from
> memory rather than generating the vspltisb, which is not ideal code 
> generation,
> but is at least correct.
>
> For -mcpu=power9, we get close, but have some bad register allocation and
> an unnecessary extend:
>
> xxspltib 0,4   <- why not just xxspltib 32,4?
> xxlor 32,0,0   <- wasted copy
> vextsb2d 0,0   <- unnecessary due to vsrad semantics
> vsrad 2,2,0
>
> Again, this is at least correct.  We have more work to do to produce the
> most efficient code, but we have PR89213 open for that.
>
>>> 2019-02-08  Bill Schmidt  
>>>
>>> * gcc.target/powerpc/srad-modulo.c: New.
>>> * gcc.target/powerpc/srd-modulo.c: New.
>> Please put "vec-" in the testcase name.  You may need to rename vec-shift.c
>> and/or vec-shr.c, which are not as generic as their names suggest.
> Ok.
>
>>> @@ -16072,6 +16116,13 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *
>>> arg0 = gimple_call_arg (stmt, 0);
>>> lhs = gimple_call_lhs (stmt);
>>>  
>>> +   /* It's sometimes useful to use one of these to build a
>>> +  splat for V2DImode, since the upper bits will be ignored.
>>> +  Don't fold if we detect that situation, as we end up
>>> +  losing the splat instruction in this case.  */
>>> +   if (size != TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (TREE_TYPE (lhs)))))
>>> + return false;
>> This isn't really detecting that situation...  Instead, it doesn't fold
>> whenever the size of the splatted elements isn't the same as the size of
>> the elts of the target vector.  That's probably perfectly safe, but please
>> spell it out.  It's fine to mention the motivating case, of course.
> Yep, will correct.  Actually, as I look back at my notes, I believe that this
> change is not necessary after all (same code generated with and without it).
> I'll verify.
>
>>> Index: gcc/testsuite/gcc.target/powerpc/srad-modulo.c
>>> ===
>>> --- gcc/testsuite/gcc.target/powerpc/srad-modulo.c  (nonexistent)
>>> +++ gcc/testsuite/gcc.target/powerpc/srad-modulo.c  (working copy)
>>> @@ -0,0 +1,43 @@
>>> +/* Test that using a character splat to set up a shift-right algebraic
>>> +   for a doubleword vector works correctly after gimple folding.  */
>>> +
>>> +/* { dg-do run { target { powerpc64*-*-* && vsx_hw } } } */
>>> +/* { dg-options { "-O3" } } */
>> powerpc64*-*-* is almost always wrong.  I don't think you need to limit
>> to 64-bit at all here, but if you do, test for lp64 instead.
> Ok.
>
>> Testing for vsx_hw but not enabling vsx is probably wrong, too.
> Weird.  I just tried adding -mvsx and I get this peculiar error we've seen
> before about AMD graphics card offloading:
>
> spawn -ignore SIGHUP /home/wschmidt/gcc/build/gcc-mainline-test2/gcc/xgcc 
> -B/home/wschmidt/gcc/build/gcc-mainline-test2/gcc/ 
> /home/wschmidt/gcc/gcc-mainline-t\
> est2/gcc/testsuite/gcc.target/powerpc/srad-modulo.c 
> -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers 
> -fdiagnostics-color=never -O2 -mvsx -lm -o \
> ./srad-modulo.exe^M
> ^[[01m^[[Kcc1:^[[m^[[K ^[[01;31m^[[Kerror: ^[[m^[[Kargument to 
> '^[[01m^[[K-O^[[m^[[K' should be a non-negative integer, 
> '^[[01m^[[Kg^[[m^[[K', '^[[01m^[[Ks^[[\
> m^[[K' or '^[[01m^[[Kfast^[[m^[[K'^M
> compiler exited with status 1
> Executing on host: /home/wschmidt/gcc/build/gcc-mainline-test2/gcc/xgcc 
> -B/home/wschmidt/gcc/build/gcc-mainline-test2/gcc/ offload_gcn7262.c
> -fno-diagnosti\
> cs-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never  
> -foffload=amdgcn-unknown-amdhsa -S -o offload_gcn7262.s(timeout = 300)
> spawn -ignore SIGHUP /home/wschmidt/gcc/build/gcc-mainline-test2/gcc/xgcc 
> -B/home/wschmidt/gcc/build/gcc-mainline-test2/gcc/ offload_gcn7262.c 
> -fno-diagnostic\
> 

Re: [PATCH] rs6000: Vector shift-right should honor modulo semantics

2019-02-10 Thread Bill Schmidt
On 2/10/19 4:05 PM, Segher Boessenkool wrote:
> Hi Bill,
>
> On Sun, Feb 10, 2019 at 10:10:02AM -0600, Bill Schmidt wrote:
>> I've added executable tests for both shift-right algebraic and shift-right 
>> logical.
>> Both fail prior to applying the patch, and work correctly afterwards.
> Please add a test for left shifts, as well?

Can do.  I verified that left shifts were not broken and figured a test case
had been added then, but have not checked.  Good to test this particular
scenario, though.

>
>> 2019-02-08  Bill Schmidt  
>>
>>  * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Shift-right
>>  and shift-left vector built-ins need to include a TRUNC_MOD_EXPR
>>  for correct semantics.  Also, don't expand a vector-splat if there
>>  is a type mismatch; let the back end handle it.
> Does it always result in just the shift instruction now?  Does the modulo
> ever remain?  (Maybe at -O0?)  Modulo is hugely expensive; it is always
> modulo by a power of two, so a simple bitmask, so maybe write that directly
> instead?

We always get the shift.  For -mcpu=power8, we always load the mask from
memory rather than generating the vspltisb, which is not ideal code generation,
but is at least correct.

For -mcpu=power9, we get close, but have some bad register allocation and
an unnecessary extend:

xxspltib 0,4   <- why not just xxspltib 32,4?
xxlor 32,0,0   <- wasted copy
vextsb2d 0,0   <- unnecessary due to vsrad semantics
vsrad 2,2,0

Again, this is at least correct.  We have more work to do to produce the
most efficient code, but we have PR89213 open for that.
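
As a rough illustration of the modulo-versus-bitmask point raised above, assuming 64-bit (doubleword) elements: reducing a shift count modulo a power-of-two width gives the same result as masking it with width - 1.  The helper names are hypothetical; this is plain scalar C, not the GIMPLE that the fold emits.

```c
#include <assert.h>
#include <stdint.h>

/* Element width assumed for this sketch: 64-bit doublewords.  */
#define ELT_BITS 64u

/* Shift count reduced with an explicit modulo.  */
static unsigned
count_by_modulo (unsigned count)
{
  return count % ELT_BITS;
}

/* The same reduction written as a bitmask; this is only valid
   because ELT_BITS is a power of two.  */
static unsigned
count_by_mask (unsigned count)
{
  return count & (ELT_BITS - 1u);
}

/* Logical shift right of one element, with the count reduced
   modulo the element width first.  */
static uint64_t
shift_right_logical_mod (uint64_t value, unsigned count)
{
  return value >> count_by_mask (count);
}
```

An out-of-range count such as 68 is reduced to 4 by either helper, so both forms select the same shift.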

>
>> 2019-02-08  Bill Schmidt  
>>
>>  * gcc.target/powerpc/srad-modulo.c: New.
>>  * gcc.target/powerpc/srd-modulo.c: New.
> Please put "vec-" in the testcase name.  You may need to rename vec-shift.c
> and/or vec-shr.c, which are not as generic as their names suggest.

Ok.

>
>> @@ -16072,6 +16116,13 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *
>>  arg0 = gimple_call_arg (stmt, 0);
>>  lhs = gimple_call_lhs (stmt);
>>  
>> +/* It's sometimes useful to use one of these to build a
>> +   splat for V2DImode, since the upper bits will be ignored.
>> +   Don't fold if we detect that situation, as we end up
>> +   losing the splat instruction in this case.  */
>> +if (size != TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (TREE_TYPE (lhs)))))
>> +  return false;
> This isn't really detecting that situation...  Instead, it doesn't fold
> whenever the size of the splatted elements isn't the same as the size of
> the elts of the target vector.  That's probably perfectly safe, but please
> spell it out.  It's fine to mention the motivating case, of course.

Yep, will correct.  Actually, as I look back at my notes, I believe that this
change is not necessary after all (same code generated with and without it).
I'll verify.

>
>> Index: gcc/testsuite/gcc.target/powerpc/srad-modulo.c
>> ===
>> --- gcc/testsuite/gcc.target/powerpc/srad-modulo.c   (nonexistent)
>> +++ gcc/testsuite/gcc.target/powerpc/srad-modulo.c   (working copy)
>> @@ -0,0 +1,43 @@
>> +/* Test that using a character splat to set up a shift-right algebraic
>> +   for a doubleword vector works correctly after gimple folding.  */
>> +
>> +/* { dg-do run { target { powerpc64*-*-* && vsx_hw } } } */
>> +/* { dg-options { "-O3" } } */
> powerpc64*-*-* is almost always wrong.  I don't think you need to limit
> to 64-bit at all here, but if you do, test for lp64 instead.

Ok.

>
> Testing for vsx_hw but not enabling vsx is probably wrong, too.

Weird.  I just tried adding -mvsx and I get this peculiar error we've seen
before about AMD graphics card offloading:

spawn -ignore SIGHUP /home/wschmidt/gcc/build/gcc-mainline-test2/gcc/xgcc 
-B/home/wschmidt/gcc/build/gcc-mainline-test2/gcc/ 
/home/wschmidt/gcc/gcc-mainline-t\
est2/gcc/testsuite/gcc.target/powerpc/srad-modulo.c -fno-diagnostics-show-caret 
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never -O2 -mvsx -lm -o \
./srad-modulo.exe^M
^[[01m^[[Kcc1:^[[m^[[K ^[[01;31m^[[Kerror: ^[[m^[[Kargument to 
'^[[01m^[[K-O^[[m^[[K' should be a non-negative integer, '^[[01m^[[Kg^[[m^[[K', 
'^[[01m^[[Ks^[[\
m^[[K' or '^[[01m^[[Kfast^[[m^[[K'^M
compiler exited with status 1
Executing on host: /home/wschmidt/gcc/build/gcc-mainline-test2/gcc/xgcc 
-B/home/wschmidt/gcc/build/gcc-mainline-test2/gcc/ offload_gcn7262.c
-fno-diagnosti\
cs-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never  
-foffload=amdgcn-unknown-amdhsa -S -o offload_gcn7262.s(timeout = 300)
spawn -ignore SIGHUP /home/wschmidt/gcc/build/gcc-mainline-test2/gcc/xgcc 
-B/home/wschmidt/gcc/build/gcc-mainline-test2/gcc/ offload_gcn7262.c 
-fno-diagnostic\
s-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never 
-foffload=amdgcn-unknown-amdhsa -S -o offload_gcn7262.s^M
xgcc: fatal error: GCC is not 

Re: [PATCH] i386: Use OI/TImode in *mov[ot]i_internal_avx with AVX512VL

2019-02-10 Thread Alan Modra
On Fri, Feb 08, 2019 at 10:51:34AM +0100, Uros Bizjak wrote:
> On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu  wrote:
> >
> > OImode and TImode moves must be done in XImode to access upper 16
> > vector registers without AVX512VL.  With AVX512VL, we can access
> > upper 16 vector registers in OImode and TImode.
> >
> > PR target/89229
> > * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> > upper 16 vector registers without TARGET_AVX512VL.
> > (*movti_internal): Likewise.
> 
> Please use (not (match_test "...")) instead of (match_test "!...") and

I'm curious.  Is there a reason other than style to ask for this
change?

-- 
Alan Modra
Australia Development Lab, IBM


Committed, config/cris/cris.c: spell "minimum" correctly.

2019-02-10 Thread Hans-Peter Nilsson
Spotted while in a recent gdb session.  JFTR, not mine...  Committed.

Index: ChangeLog
===
--- ChangeLog   (revision 268759)
+++ ChangeLog   (revision 268760)
@@ -1,3 +1,8 @@
+2019-02-11  Hans-Peter Nilsson  
+
+   * config/cris/cris.c (cris_preferred_minimum_alignment): Fix name
+   typo.
+
 2019-02-10  H.J. Lu  
 
* config/i386/constraints.md (Yd): Replace AVX512BW with AVX512DQ
Index: config/cris/cris.c
===
--- config/cris/cris.c  (revision 268759)
+++ config/cris/cris.c  (revision 268760)
@@ -4340,7 +4340,7 @@ cris_hard_regno_mode_ok (unsigned int re
 /* Return the preferred minimum alignment for a static object.  */
 
 static HOST_WIDE_INT
-cris_preferred_mininum_alignment (void)
+cris_preferred_minimum_alignment (void)
 {
   if (!TARGET_CONST_ALIGN)
 return 8;
@@ -4354,7 +4354,7 @@ cris_preferred_mininum_alignment (void)
 static HOST_WIDE_INT
 cris_static_rtx_alignment (machine_mode mode)
 {
-  return MAX (cris_preferred_mininum_alignment (), GET_MODE_ALIGNMENT (mode));
+  return MAX (cris_preferred_minimum_alignment (), GET_MODE_ALIGNMENT (mode));
 }
 
 /* Implement TARGET_CONSTANT_ALIGNMENT.  Note that this hook has the
@@ -4367,7 +4367,7 @@ cris_static_rtx_alignment (machine_mode 
 static HOST_WIDE_INT
 cris_constant_alignment (const_tree, HOST_WIDE_INT basic_align)
 {
-  return MAX (cris_preferred_mininum_alignment (), basic_align);
+  return MAX (cris_preferred_minimum_alignment (), basic_align);
 }
 
 #if 0


[PATCH] i386: Fix a typo in comments for "Yd"

2019-02-10 Thread H.J. Lu
config/i386/constraints.md has

(define_register_constraint "Yd"
 "TARGET_AVX512DQ ? ALL_SSE_REGS : TARGET_SSE4_1 ? SSE_REGS : NO_REGS"
 "@internal Any EVEX encodable SSE register (@code{%xmm0-%xmm31}) for AVX512DQ 
target or any SSE register for SSE4_1 target.")

Comments for "Yd" should mention AVX512DQ, not AVX512BW.
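
The nested conditional in the constraint can be read as the following sketch.  The enum and the function are hypothetical stand-ins for GCC's register classes and target flags, used only to spell out the order in which the two tests are applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the register classes named in the
   constraint; these are not GCC's real enum values.  */
enum reg_class_sketch
{
  NO_REGS_SKETCH,
  SSE_REGS_SKETCH,
  ALL_SSE_REGS_SKETCH
};

/* Mirrors "TARGET_AVX512DQ ? ALL_SSE_REGS
	    : TARGET_SSE4_1 ? SSE_REGS : NO_REGS".
   The AVX512DQ test is made first, so it wins when both are set.  */
static enum reg_class_sketch
yd_class (bool avx512dq, bool sse4_1)
{
  return avx512dq ? ALL_SSE_REGS_SKETCH
	 : sse4_1 ? SSE_REGS_SKETCH
	 : NO_REGS_SKETCH;
}
```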

* config/i386/constraints.md (Yd): Replace AVX512BW with AVX512DQ
in comments
---
 gcc/config/i386/constraints.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 33921aea267..16075b4acf3 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -96,7 +96,7 @@
 
 ;; We use the Y prefix to denote any number of conditional register sets:
 ;;  z  First SSE register.
-;;  d  any EVEX encodable SSE register for AVX512BW target or
+;;  d  any EVEX encodable SSE register for AVX512DQ target or
 ;; any SSE register for SSE4_1 target.
 ;;  p  Integer register when TARGET_PARTIAL_REG_STALL is disabled
 ;;  a  Integer register when zero extensions with AND are disabled
-- 
2.20.1



Re: Follow-up-fix to "[PATCH] Move PR84877 fix elsewhere (PR bootstrap/88450)"

2019-02-10 Thread Hans-Peter Nilsson
> Date: Mon, 11 Feb 2019 02:05:11 +0100
> From: Hans-Peter Nilsson 

> Regtested on cris-elf, where it "introduces" gcc.dg/pr84877.c

Correction: "no regressions" (not introduced by this proposed
patch, I misread).

brgds, H-P


Follow-up-fix 2 to "[PATCH] Move PR84877 fix elsewhere (PR bootstrap/88450)"

2019-02-10 Thread Hans-Peter Nilsson
Here's the follow-up, getting rid of the observed
alignment-padding in execute/930126-1.c: the x parameter in f
spuriously being runtime-aligned to BITS_PER_WORD.  I separated
this change because this is an older issue, a change introduced
in r94104 where BITS_PER_WORD was chosen perhaps because we
expect register-sized writes into this area.  Here, we instead
align to a minimum of PREFERRED_STACK_BOUNDARY, but of course
gated on !  STRICT_ALIGNMENT.

Regtested cris-elf and x86_64-pc-linux-gnu.

Ok to commit?

gcc:
* function.c (assign_parm_setup_block): If not STRICT_ALIGNMENT,
instead of always BITS_PER_WORD, align the stacked
parameter to a minimum PREFERRED_STACK_BOUNDARY.

--- function.c.orig2	Sat Feb  9 00:53:17 2019
+++ function.c  Sat Feb  9 23:21:35 2019
@@ -2912,7 +2912,10 @@ assign_parm_setup_block (struct assign_p
   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
   if (stack_parm == 0)
 {
-  SET_DECL_ALIGN (parm, MAX (DECL_ALIGN (parm), BITS_PER_WORD));
+  HOST_WIDE_INT min_parm_align
+   = STRICT_ALIGNMENT ? BITS_PER_WORD : PREFERRED_STACK_BOUNDARY;
+
+  SET_DECL_ALIGN (parm, MAX (DECL_ALIGN (parm), min_parm_align));
   if (DECL_ALIGN (parm) > MAX_SUPPORTED_STACK_ALIGNMENT)
{
  rtx allocsize = gen_int_mode (size_stored, Pmode);
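
The effect of the hunk above on the chosen alignment can be sketched in standalone C.  The function names are hypothetical, and the target macros are passed in as plain parameters so the arithmetic can be tried with any values; a 32-bit word and a 16-bit preferred stack boundary are assumed here purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned
max_u (unsigned a, unsigned b)
{
  return a > b ? a : b;
}

/* Before the change: the parameter is always aligned to at
   least one word.  */
static unsigned
old_parm_align (unsigned decl_align, unsigned bits_per_word)
{
  return max_u (decl_align, bits_per_word);
}

/* After the change: the word-sized minimum is kept only for
   strict-alignment targets; otherwise the minimum is the
   preferred stack boundary.  All values are in bits.  */
static unsigned
new_parm_align (unsigned decl_align, bool strict_alignment,
		unsigned bits_per_word,
		unsigned preferred_stack_boundary)
{
  unsigned min_parm_align
    = strict_alignment ? bits_per_word : preferred_stack_boundary;
  return max_u (decl_align, min_parm_align);
}
```

With those assumed values, a byte-aligned parameter moves from 32-bit to 16-bit alignment on a non-strict-alignment target, while a larger existing DECL_ALIGN is left unchanged by the MAX.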

brgds, H-P


Follow-up-fix to "[PATCH] Move PR84877 fix elsewhere (PR bootstrap/88450)"

2019-02-10 Thread Hans-Peter Nilsson
> Date: Thu, 10 Jan 2019 00:06:01 +0100
> From: Jakub Jelinek 

> 2019-01-09  Jakub Jelinek  
> 
>   PR middle-end/84877
>   PR bootstrap/88450
>   * function.c (assign_stack_local_1): Revert the 2018-11-21 changes.
>   (assign_parm_setup_block): Do the argument slot realignment here
>   instead.

When this was committed as r267812, results for cris-elf went
from regress-8 to regress-160 (now at r268749, regress-154).


I analyzed one of the simpler cases,
gcc.c-torture/execute/930126-1.c at -O0.

It looks like the code added in r267812 doesn't handle targets
where MAX_SUPPORTED_STACK_ALIGNMENT is less than BITS_PER_WORD
(by necessity, non-STRICT_ALIGNMENT targets); the bug is in not
adding necessary alignment to the allocated space.

cris-elf is a non-STRICT_ALIGNMENT target and
MAX_SUPPORTED_STACK_ALIGNMENT is defaulted from STACK_BOUNDARY,
16 (bits).

The only difference with r267812 is that before, 10 bytes
(excluding 4 for the saved frame-pointer-address) was allocated
for the stack-frame of the function f and at r267812, 8 bytes;
the literally only differences are two instructions, one when
allocating stack and one when forming the pointer to the
stack-parameter.  The size of the actual object is 5 bytes, but
padded to 8 bytes due to copying from incoming registers (which
is IMO a valid reason, maybe not worthwhile to improve).  Please
remember that size-padding is different from alignment-padding.

The setting of DECL_ALIGN (parm) to BITS_PER_WORD (originating
at r94104) is wrong too, but not fatal.  I'll fix that in a
follow-up change and let's pretend for the moment that there's a
valid reason for aligning "parm" in this case (for example, a
user-provided overalignment).

At function entry, after saving the frame-pointer-register, the
stack-pointer is 0x3dfffece (both versions).  The parameter-
pointer-align instructions align this to a multiple of 4 for
both versions, and this of course fails as there's no padding
with r267812.  The actual failure is due to overwriting the
saved frame-pointer-register, with the wrong value then used in
main to verify the result.
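
The difference between size-padding and alignment-padding can be sketched as follows.  This is a simplified model, not the code GCC generates: it assumes the allocated block starts at a given address, that the object address is then rounded up to the requested alignment, and that anything past the end of the block clobbers whatever lies above it.  The addresses are derived from the stack-pointer value quoted above only for illustration, and all helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size-padding: round SIZE up to a multiple of UNIT.  */
static uint32_t
ceil_round (uint32_t size, uint32_t unit)
{
  return (size + unit - 1) / unit * unit;
}

/* Alignment-padding: round ADDR up to a multiple of ALIGN,
   which must be a power of two.  */
static uint32_t
align_up (uint32_t addr, uint32_t align)
{
  return (addr + align - 1) & ~(align - 1);
}

/* True if STORED bytes, placed at the aligned address, still
   lie inside the ALLOCATED bytes that begin at START.  */
static bool
fits_after_align (uint32_t start, uint32_t allocated,
		  uint32_t stored, uint32_t align)
{
  return align_up (start, align) + stored <= start + allocated;
}
```

Under these assumptions, a 5-byte object is size-padded to 8 bytes; an 8-byte block ending at 0x3dfffece starts at 0x3dfffec6, and aligning that start to 4 bytes pushes the stored object 2 bytes past the end of the block, whereas a 10-byte block ending at the same address leaves enough room.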


To fix the r267812 incorrectness we need to use the *stored*
size, i.e. that which will be stored into using register-sized
writes.  It seems like the bug is just a typo, so the fix is
as simple as follows.  Note the usage of "diff -U 10" to show
that size_stored is used in the "then"-arm.

Regtested on cris-elf, where it "introduces" gcc.dg/pr84877.c
but on inspection that's a known bug with a bit of hairy churn,
reverts and all.  See the PR.  This patch has no bearing on that
PR; this is for incoming parameters, and PR84877 is for (not
doing) alignment of pass-by-reference parameters for outgoing
parameters.  Still, maybe the solution to that PR should have
code like that in the context below...which I see you already
noted, in the patch-message to which this is a reply.

Also regtested on x86_64-pc-linux-gnu (without regressions)
together with the followup-change.

Ok to commit?

gcc/ChangeLog:
* function.c (assign_parm_setup_block): Use the stored
size, not the passed size, when allocating stack-space,
also for a parameter with alignment larger than
MAX_SUPPORTED_STACK_ALIGNMENT.

both arms of the containing "if".)
--- gcc/function.c.orig Sun Jan 20 19:20:16 2019
+++ gcc/function.c  Sat Feb  9 00:53:17 2019
@@ -2907,23 +2907,23 @@ assign_parm_setup_block (struct assign_p
}
   data->stack_parm = NULL;
 }
 
   size = int_size_in_bytes (data->passed_type);
   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
   if (stack_parm == 0)
 {
   SET_DECL_ALIGN (parm, MAX (DECL_ALIGN (parm), BITS_PER_WORD));
   if (DECL_ALIGN (parm) > MAX_SUPPORTED_STACK_ALIGNMENT)
{
- rtx allocsize = gen_int_mode (size, Pmode);
+ rtx allocsize = gen_int_mode (size_stored, Pmode);
  get_dynamic_stack_size (&allocsize, 0, DECL_ALIGN (parm), NULL);
  stack_parm = assign_stack_local (BLKmode, UINTVAL (allocsize),
   MAX_SUPPORTED_STACK_ALIGNMENT);
  rtx addr = align_dynamic_address (XEXP (stack_parm, 0),
DECL_ALIGN (parm));
  mark_reg_pointer (addr, DECL_ALIGN (parm));
  stack_parm = gen_rtx_MEM (GET_MODE (stack_parm), addr);
  MEM_NOTRAP_P (stack_parm) = 1;
}
   else
stack_parm = assign_stack_local (BLKmode, size_stored,

brgds, H-P


Re: [PATCH 12/43] i386: Emulate MMX vec_dupv2si with SSE

2019-02-10 Thread H.J. Lu
On Sun, Feb 10, 2019 at 1:49 PM Uros Bizjak  wrote:
>
> On Sun, Feb 10, 2019 at 10:45 PM Uros Bizjak  wrote:
>
> > > > > +  [(const_int 0)]
> > > > > +{
> > > > > +  /* Emulate MMX vec_dupv2si with SSE vec_dupv4si.  */
> > > > > +  rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> > > > > +  rtx insn = gen_vec_dupv4si (op0, operands[1]);
> > > > > +  emit_insn (insn);
> > > > > +  DONE;
> > > >
> > > > Please write this simple RTX explicitly in the place of (const_int 0) 
> > > > above.
> > >
> > > rtx insn = gen_vec_dupv4si (op0, operands[1]);
> > >
> > > is easy.   How do I write
> > >
> > > rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> > >
> > > in place of  (const_int 0)?
> >
> >   [(set (match_dup 2)
> > (vec_duplicate:V4SI (match_dup 1)))]
> >
> > with
> >
> > "operands[2] = gen_rtx_REG (V4SImode, REGNO (operands[0]));"
> >
> > or even better:
> >
> > "operands[2] = gen_lowpart (V4SImode, operands[0]);"
> >
> > in the preparation statement.
>
> Even shorter is
>
> "operands[0] = gen_lowpart (V4SImode, operands[0]);"
>
> and use (match_dup 0) instead of (match_dup 2) in the RTX.
>
> There are plenty of examples throughout sse.md.
>

This works:

(define_insn_and_split "*vec_dupv2si"
  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
        (vec_duplicate:V2SI
          (match_operand:SI 1 "register_operand" "0,0,Yv")))]
  "TARGET_MMX || TARGET_MMX_WITH_SSE"
  "@
   punpckldq\t%0, %0
   #
   #"
  "TARGET_MMX_WITH_SSE && reload_completed"
  [(set (match_dup 0)
        (vec_duplicate:V4SI (match_dup 1)))]
  "operands[0] = gen_rtx_REG (V4SImode, REGNO (operands[0]));"
  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
   (set_attr "type" "mmxcvt,ssemov,ssemov")
   (set_attr "mode" "DI,TI,TI")])
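
The idea behind the split can be modelled in plain C: duplicating a 32-bit value into all four lanes of a 128-bit vector leaves the low 64 bits equal to the two-lane duplicate.  This is only a lane-level sketch with hypothetical struct and function names; it assumes lane 0 is the low element and does not model RTL, register numbering, or subregs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lane models of the two vector modes.  */
typedef struct { int32_t lane[2]; } v2si;
typedef struct { int32_t lane[4]; } v4si;

/* Models (vec_duplicate:V2SI x).  */
static v2si
vec_dup_v2si (int32_t x)
{
  v2si r = { { x, x } };
  return r;
}

/* Models (vec_duplicate:V4SI x).  */
static v4si
vec_dup_v4si (int32_t x)
{
  v4si r = { { x, x, x, x } };
  return r;
}

/* View the low 64 bits of the wide vector as a V2SI value.  */
static v2si
low_half (v4si wide)
{
  v2si r;
  memcpy (&r, &wide, sizeof r);
  return r;
}

/* Nonzero if the wide duplicate, truncated to its low half,
   equals the narrow duplicate.  */
static int
emulation_matches (int32_t x)
{
  v2si narrow = vec_dup_v2si (x);
  v2si emulated = low_half (vec_dup_v4si (x));
  return memcmp (&narrow, &emulated, sizeof narrow) == 0;
}
```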

Thanks.

-- 
H.J.


Re: [RS6000] Don't support inline PLT for ABI_V4 bss-plt

2019-02-10 Thread Alan Modra
On Fri, Feb 08, 2019 at 04:53:44PM -0600, Segher Boessenkool wrote:
> > @@ -37981,7 +37982,7 @@ rs6000_call_sysv (rtx value, rtx func_desc, rtx 
> > tlsarg, rtx cookie)
> >func = rs6000_longcall_ref (func_desc, tlsarg);
> >/* If the longcall was implemented using PLT16 relocs, then r11
> >  needs to be valid at the call for lazy linking.  */
> 
> This comment could use some work.

True.  I've rectified that and combined this patch with the
-mno-pltseq patch, documented as requested and enhanced to emit errors
and warnings when invalid options are combined.  The two patches go
together, because someone building their ppc32 compiler with
--enable-secureplt but having old -mbss-plt relocatable objects on
their system will find that linking new -msecure-plt -mlongcall
objects against -mbss-plt objects will fail.  One solution to that
problem is to compile with -mbss-plt whenever -mlongcall is needed,
but -mlongcall -mno-pltseq provides a way to transition everything to
-msecure-plt objects.

Bootstrapped and regression tested powerpc64le-linux and
powerpc64-linux biarch.  I also built an rs6000-aix7.2 cross compiler
(well, cc1 and cc1plus, I don't have the aix headers to build libgcc)
to ensure the patch didn't introduce undefined references in powerpc 
targets not using sysv4.h.  OK for mainline?

I'd also like to fix the formatting in linux64.h
SUBSUBTARGET_OVERRIDE_OPTIONS by moving all the continuation
backslashes one tab stop to the right when I commit this patch.  Is
that OK too?

* doc/invoke.texi (man page RS/6000 and PowerPC Options): Mention
-mlongcall and -mpltseq.
(RS/6000 and PowerPC Options <-mlongcall>): Mention inline PLT calls.
(RS/6000 and PowerPC Options <-mpltseq>): Document.
* config/rs6000/rs6000.h (TARGET_PLTSEQ): Define.
* config/rs6000/sysv4.opt (mpltseq): New option.
* config/rs6000/sysv4.h (TARGET_PLTSEQ): Redefine.
(SUBTARGET_OVERRIDE_OPTIONS): Error if given -mpltseq when assembler
support is lacking.  Don't allow -mpltseq with -mbss-plt.
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Warn if
-mpltseq given for ELFv1.
* config/rs6000/rs6000.c (rs6000_call_aix): Comment on UNSPEC_PLTSEQ.
Only use UNSPEC_PLTSEQ for inline PLT calls.
(rs6000_call_sysv, rs6000_sibcall_sysv): Expand comments.  Only
use UNSPEC_PLTSEQ for inline PLT calls.
(rs6000_indirect_call_template_1, rs6000_longcall_ref),
(rs6000_call_aix, rs6000_call_sysv, rs6000_sibcall_sysv): Replace
uses of HAVE_AS_PLTSEQ with TARGET_PLTSEQ, simplifying.
* config/rs6000/rs6000.md (pltseq_tocsave_<mode>),
(pltseq_plt16_ha_<mode>, pltseq_plt16_lo_<mode>),
(pltseq_mtctr_<mode>): Likewise.

diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 29e9afa7f3d..df1d8a9f45a 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -155,6 +155,13 @@ extern int dot_symbols;
TARGET_NO_SUM_IN_TOC = 0;   \
}   \
}   \
+ if (TARGET_PLTSEQ && DEFAULT_ABI != ABI_ELFv2)\
+   {   \
+ if (global_options_set.x_rs6000_pltseq)   \
+   warning (0, "%qs unsupported for this ABI", \
+"-mpltseq");   \
+ rs6000_pltseq = false;\
+   }   \
}   \
   else \
{   \
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index b4ff18d414c..99f04bba148 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21665,7 +21665,7 @@ rs6000_indirect_call_template_1 (rtx *operands, 
unsigned int funop,
|| (REG_P (operands[funop])
&& REGNO (operands[funop]) == LR_REGNO));
 
-  if (!TARGET_MACHO && HAVE_AS_PLTSEQ && GET_CODE (operands[funop]) == UNSPEC)
+  if (!TARGET_MACHO && TARGET_PLTSEQ && GET_CODE (operands[funop]) == UNSPEC)
 {
   const char *rel64 = TARGET_64BIT ? "64" : "";
   char tls[29];
@@ -32827,8 +32827,7 @@ rs6000_longcall_ref (rtx call_ref, rtx arg)
   call_ref = gen_rtx_SYMBOL_REF (VOIDmode, IDENTIFIER_POINTER (node));
 }
 
-  if (HAVE_AS_PLTSEQ
-  && (DEFAULT_ABI == ABI_ELFv2 || DEFAULT_ABI == ABI_V4))
+  if (TARGET_PLTSEQ)
 {
   rtx base = const0_rtx;
   int regno;
@@ -37793,14 +37792,20 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx 
tlsarg, rtx cookie)
   rtx call[4];
   int n_call;
   rtx insn;
+  bool is_pltseq_longcall;
 
   if (global_tlsarg)
 

Re: [PATCH 2/2, d] Enable tests for rt.util.typeinfo and core.internal.convert

2019-02-10 Thread Iain Buclaw
On Wed, 28 Nov 2018 at 22:44, Johannes Pfau  wrote:
>
> With these backports, these tests now pass for GDC and we don't
> need the special cases in the Makefiles anymore.
>
> --
> Johannes
>
> ---
> libphobos/ChangeLog:
>
> 2018-11-28  Johannes Pfau  
>
> * libdruntime/Makefile.am: Test rt.util.typeinfo and 
> core.internal.convert.
> * libdruntime/Makefile.in: Rebuild.
>

Thanks, committed this in r268755 after doing the initial backport of
core.internal.hash.

-- 
Iain


[PATCH, libphobos] Committed fix for hashing complex reals

2019-02-10 Thread Iain Buclaw
Hi,

This is a rebase of a patch sent to this mailing list by Johannes; it
has been committed to upstream druntime and is now being downstreamed.

Bootstrapped and regression tested on x86_64-linux-gnu.

Committed to trunk as r268755

-- 
Iain
---
libphobos/ChangeLog:

2019-02-10  Iain Buclaw  

* Makefile.in: Rebuild.
* configure: Rebuild.
* libdruntime/Makefile.am: Test rt.util.typeinfo and
core.internal.convert.
* libdruntime/Makefile.in: Rebuild.
* src/Makefile.in: Rebuild.
* testsuite/Makefile.in: Rebuild.
* testsuite/libphobos.hash/test_hash.d: Update test.
---
diff --git a/libphobos/Makefile.in b/libphobos/Makefile.in
index 3059196d75a..87eaf28aba7 100644
--- a/libphobos/Makefile.in
+++ b/libphobos/Makefile.in
@@ -15,7 +15,7 @@
 @SET_MAKE@
 
 # Makefile for the toplevel directory of the D Standard library.
-# Copyright (C) 2006-2018 Free Software Foundation, Inc.
+# Copyright (C) 2006-2019 Free Software Foundation, Inc.
 #
 # GCC is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -319,7 +319,6 @@ phobos_compiler_shared_flag = @phobos_compiler_shared_flag@
 prefix = @prefix@
 program_transform_name = @program_transform_name@
 psdir = @psdir@
-runstatedir = @runstatedir@
 sbindir = @sbindir@
 sharedstatedir = @sharedstatedir@
 srcdir = @srcdir@
diff --git a/libphobos/configure b/libphobos/configure
index d247d9adc1f..9f96ad5d190 100755
--- a/libphobos/configure
+++ b/libphobos/configure
@@ -3,7 +3,7 @@
 # Generated by GNU Autoconf 2.69 for package-unused version-unused.
 #
 #
-# Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
+# Copyright (C) 1992-2019 Free Software Foundation, Inc.
 #
 #
 # This configure script is free software; the Free Software Foundation
@@ -782,7 +782,6 @@ infodir
 docdir
 oldincludedir
 includedir
-runstatedir
 localstatedir
 sharedstatedir
 sysconfdir
@@ -868,7 +867,6 @@ datadir='${datarootdir}'
 sysconfdir='${prefix}/etc'
 sharedstatedir='${prefix}/com'
 localstatedir='${prefix}/var'
-runstatedir='${localstatedir}/run'
 includedir='${prefix}/include'
 oldincludedir='/usr/include'
 docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
@@ -1121,15 +1119,6 @@ do
   | -silent | --silent | --silen | --sile | --sil)
 silent=yes ;;
 
-  -runstatedir | --runstatedir | --runstatedi | --runstated \
-  | --runstate | --runstat | --runsta | --runst | --runs \
-  | --run | --ru | --r)
-ac_prev=runstatedir ;;
-  -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
-  | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
-  | --run=* | --ru=* | --r=*)
-runstatedir=$ac_optarg ;;
-
   -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
 ac_prev=sbindir ;;
   -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@@ -1267,7 +1256,7 @@ fi
 for ac_var in	exec_prefix prefix bindir sbindir libexecdir datarootdir \
 		datadir sysconfdir sharedstatedir localstatedir includedir \
 		oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
-		libdir localedir mandir runstatedir
+		libdir localedir mandir
 do
   eval ac_val=\$$ac_var
   # Remove trailing slashes.
@@ -1420,7 +1409,6 @@ Fine tuning of the installation directories:
   --sysconfdir=DIRread-only single-machine data [PREFIX/etc]
   --sharedstatedir=DIRmodifiable architecture-independent data [PREFIX/com]
   --localstatedir=DIR modifiable single-machine data [PREFIX/var]
-  --runstatedir=DIR   modifiable per-process data [LOCALSTATEDIR/run]
   --libdir=DIRobject code libraries [EPREFIX/lib]
   --includedir=DIRC header files [PREFIX/include]
   --oldincludedir=DIR C header files for non-gcc [/usr/include]
@@ -1580,7 +1568,7 @@ if $ac_init_version; then
 package-unused configure version-unused
 generated by GNU Autoconf 2.69
 
-Copyright (C) 2012 Free Software Foundation, Inc.
+Copyright (C) 2012-2019 Free Software Foundation, Inc.
 This configure script is free software; the Free Software Foundation
 gives unlimited permission to copy, distribute and modify it.
 _ACEOF
@@ -11508,7 +11496,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11511 "configure"
+#line 11499 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11614,7 +11602,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11617 "configure"
+#line 11605 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -15578,7 +15566,7 @@ package-unused config.status version-unused
 configured by $0, generated by GNU Autoconf 2.69,
   with options \\"\$ac_cs_config\\"
 
-Copyright (C) 2012 Free Software Foundation, Inc.
+Copyright (C) 2012-2019 Free Software Foundation, Inc.
 This config.status script is free software; the Free Software Foundation
 gives 

Re: Fix odr ICE on Ada LTO

2019-02-10 Thread Jan Hubicka
Hi,
I am attaching the correct patch.
The option is new only in a relative sense - it was added 5 years ago
with the original ODR warning infrastructure.
We have -flto-odr-type-merging that controls streaming of the data needed
for -Wodr to work and -fno-devirtualize that controls streaming of BINFOs.

I was concerned at that time about extra overhead this streaming causes,
but with all the optimizations this overhead is quite small now (i.e.
the mangled type names and there are "only" about 4k types in Firefox)

What is annoying about -fno-lto-odr-type-merging is that we lose mangled
names that are also used by devirtualization. ipa-devirt still has two
implementations of the main hash - one based on mangled names and the
original one based on virtual table names, but combining both hashes
results in incomplete type inheritance graphs.

Honza


PR lto/89272
* tree.c (fld_simplified_type_name): Also keep TYPE_DECL for
polymorphic types.


--- trunk/gcc/tree.c2019/02/10 09:45:55 268741
+++ trunk/gcc/tree.c2019/02/10 10:46:43 268742
@@ -5153,7 +5153,10 @@
  TYPE_DECL if the type doesn't have linkage.
  this must match fld_  */
   if (type != TYPE_MAIN_VARIANT (type)
-  || !DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (type)))
+  || (!DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (type))
+ && (TREE_CODE (type) != RECORD_TYPE
+ || !TYPE_BINFO (type)
+ || !BINFO_VTABLE (TYPE_BINFO (type)))))
 return DECL_NAME (TYPE_NAME (type));
   return TYPE_NAME (type);
 }



Re: [PATCH] rs6000: Vector shift-right should honor modulo semantics

2019-02-10 Thread Segher Boessenkool
Hi Bill,

On Sun, Feb 10, 2019 at 10:10:02AM -0600, Bill Schmidt wrote:
> I've added executable tests for both shift-right algebraic and shift-right 
> logical.
> Both fail prior to applying the patch, and work correctly afterwards.

Please add a test for left shifts, as well?

> 2019-02-08  Bill Schmidt  
> 
>   * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Shift-right
>   and shift-left vector built-ins need to include a TRUNC_MOD_EXPR
>   for correct semantics.  Also, don't expand a vector-splat if there
>   is a type mismatch; let the back end handle it.

Does it always result in just the shift instruction now?  Does the modulo
ever remain?  (Maybe at -O0?)  Modulo is hugely expensive; it is always
modulo by a power of two, so a simple bitmask, so maybe write that directly
instead?
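
To illustrate the point about power-of-two moduli, here is a minimal sketch
in plain C, not the actual gimple folding code.  The helper names
(shift_count_mask, shift_count_mod, srd_modulo) are made up for this
example, and it assumes the element width is a power of two, which holds
for the 8/16/32/64-bit vector elements discussed here.

```c
#include <stdint.h>

/* Reduce a shift count modulo the element width using a bitmask.
   Only valid when width_bits is a power of two.  */
static inline unsigned
shift_count_mask (unsigned count, unsigned width_bits)
{
  return count & (width_bits - 1);
}

/* The same reduction written as an explicit modulo, for comparison.  */
static inline unsigned
shift_count_mod (unsigned count, unsigned width_bits)
{
  return count % width_bits;
}

/* Scalar model of a doubleword logical shift right with modulo
   semantics: only the low 6 bits of the count are used.  */
static inline uint64_t
srd_modulo (uint64_t value, unsigned count)
{
  return value >> shift_count_mask (count, 64);
}
```

For unsigned counts both helpers give the same answer, so writing the mask
directly would express the same semantics without relying on a later pass
to simplify the modulo.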

> 2019-02-08  Bill Schmidt  
> 
>   * gcc.target/powerpc/srad-modulo.c: New.
>   * gcc.target/powerpc/srd-modulo.c: New.

Please put "vec-" in the testcase name.  You may need to rename vec-shift.c
and/or vec-shr.c, which are not as generic as their names suggest.

> @@ -16072,6 +16116,13 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *
>   arg0 = gimple_call_arg (stmt, 0);
>   lhs = gimple_call_lhs (stmt);
>  
> + /* It's sometimes useful to use one of these to build a
> +splat for V2DImode, since the upper bits will be ignored.
> +Don't fold if we detect that situation, as we end up
> +losing the splat instruction in this case.  */
> + if (size != TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (TREE_TYPE (lhs)))))
> +   return false;

This isn't really detecting that situation...  Instead, it doesn't fold
whenever the size of the splatted elements isn't the same as the size of
the elts of the target vector.  That's probably perfectly safe, but please
spell it out.  It's fine to mention the motivating case, of course.

> Index: gcc/testsuite/gcc.target/powerpc/srad-modulo.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/srad-modulo.c(nonexistent)
> +++ gcc/testsuite/gcc.target/powerpc/srad-modulo.c(working copy)
> @@ -0,0 +1,43 @@
> +/* Test that using a character splat to set up a shift-right algebraic
> +   for a doubleword vector works correctly after gimple folding.  */
> +
> +/* { dg-do run { target { powerpc64*-*-* && vsx_hw } } } */
> +/* { dg-options { "-O3" } } */

powerpc64*-*-* is almost always wrong.  I don't think you need to limit
to 64-bit at all here, but if you do, test for lp64 instead.

Testing for vsx_hw but not enabling vsx is probably wrong, too.

Does it need -O3, does -O2 not work?

Should this testcase check expected machine code as well?

> --- gcc/testsuite/gcc.target/powerpc/srd-modulo.c (nonexistent)
> +++ gcc/testsuite/gcc.target/powerpc/srd-modulo.c (working copy)

> +vui64_t
> +test_sradi_4 (vui64_t a)
> +{
> +  return vec_sradi (a, 4);
> +}

Pasto?  (srad vs. srd).


Segher


Re: [PATCH 12/43] i386: Emulate MMX vec_dupv2si with SSE

2019-02-10 Thread Uros Bizjak
On Sun, Feb 10, 2019 at 10:45 PM Uros Bizjak  wrote:

> > > > +  [(const_int 0)]
> > > > +{
> > > > +  /* Emulate MMX vec_dupv2si with SSE vec_dupv4si.  */
> > > > +  rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> > > > +  rtx insn = gen_vec_dupv4si (op0, operands[1]);
> > > > +  emit_insn (insn);
> > > > +  DONE;
> > >
> > > Please write this simple RTX explicitly in the place of (const_int 0) 
> > > above.
> >
> > rtx insn = gen_vec_dupv4si (op0, operands[1]);
> >
> > is easy.   How do I write
> >
> > rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> >
> > in place of  (const_int 0)?
>
>   [(set (match_dup 2)
> (vec_duplicate:V4SI (match_dup 1)))]
>
> with
>
> "operands[2] = gen_rtx_REG (V4SImode, REGNO (operands[0]));"
>
> or even better:
>
> "operands[2] = gen_lowpart (V4SImode, operands[0]);"
>
> in the preparation statement.

Even shorter is

"operands[0] = gen_lowpart (V4SImode, operands[0]);"

and use (match_dup 0) instead of (match_dup 2) in the RTX.

There are plenty of examples throughout sse.md.

Uros.


Re: [PATCH 12/43] i386: Emulate MMX vec_dupv2si with SSE

2019-02-10 Thread Uros Bizjak
On Sun, Feb 10, 2019 at 10:01 PM H.J. Lu  wrote:
>
> On Sun, Feb 10, 2019 at 2:36 AM Uros Bizjak  wrote:
> >
> > On 2/10/19, H.J. Lu  wrote:
> > > Emulate MMX vec_dupv2si with SSE.  Only SSE register source operand is
> > > allowed.
> > >
> > >   PR target/89021
> > >   * config/i386/mmx.md (*vec_dupv2si): Changed to
> > >   define_insn_and_split and also allow TARGET_MMX_WITH_SSE to
> > >   support SSE emulation.
> > >   * config/i386/sse.md (*vec_dupv4si): Renamed to ...
> > >   (vec_dupv4si): This.
> > > ---
> > >  gcc/config/i386/mmx.md | 27 ---
> > >  gcc/config/i386/sse.md |  2 +-
> > >  2 files changed, 21 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> > > index d360e97c98b..1ee51c5deb7 100644
> > > --- a/gcc/config/i386/mmx.md
> > > +++ b/gcc/config/i386/mmx.md
> > > @@ -1420,14 +1420,27 @@
> > > (set_attr "length_immediate" "1")
> > > (set_attr "mode" "DI")])
> > >
> > > -(define_insn "*vec_dupv2si"
> > > -  [(set (match_operand:V2SI 0 "register_operand" "=y")
> > > +(define_insn_and_split "*vec_dupv2si"
> > > +  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
> > >   (vec_duplicate:V2SI
> > > -   (match_operand:SI 1 "register_operand" "0")))]
> > > -  "TARGET_MMX"
> > > -  "punpckldq\t%0, %0"
> > > -  [(set_attr "type" "mmxcvt")
> > > -   (set_attr "mode" "DI")])
> > > +   (match_operand:SI 1 "register_operand" "0,0,Yv")))]
> > > +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
> > > +  "@
> > > +   punpckldq\t%0, %0
> > > +   #
> > > +   #"
> > > +  "&& reload_completed && TARGET_MMX_WITH_SSE"
> >
> > Please fix above.
>
> I will use
>
> "TARGET_MMX_WITH_SSE && reload_completed"
>
> > > +  [(const_int 0)]
> > > +{
> > > +  /* Emulate MMX vec_dupv2si with SSE vec_dupv4si.  */
> > > +  rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> > > +  rtx insn = gen_vec_dupv4si (op0, operands[1]);
> > > +  emit_insn (insn);
> > > +  DONE;
> >
> > Please write this simple RTX explicitly in the place of (const_int 0) above.
>
> rtx insn = gen_vec_dupv4si (op0, operands[1]);
>
> is easy.   How do I write
>
> rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
>
> in place of  (const_int 0)?

  [(set (match_dup 2)
(vec_duplicate:V4SI (match_dup 1)))]

with

"operands[2] = gen_rtx_REG (V4SImode, REGNO (operands[0]));"

or even better:

"operands[2] = gen_lowpart (V4SImode, operands[0]);"

in the preparation statement.

Uros.


Re: [PATCH 1/2, d] Fix hashing of complex reals

2019-02-10 Thread Iain Buclaw
On Thu, 29 Nov 2018 at 11:51, Iain Buclaw  wrote:
>
> On Wed, 28 Nov 2018 at 22:44, Johannes Pfau  wrote:
> >
> > > Hashing of complex types where the floating point type used
> > > for the real and imaginary parts has padding (such as X86 80 bit reals)
> > > is currently broken in druntime.
> >
> > Fixed by backporting https://github.com/dlang/druntime/pull/2356
> > from druntime commit 29ce0543cb62229f005b2bc8540416dbccd1130e
> >
> > Tested at https://github.com/D-Programming-GDC/GDC/pull/768
> >
> > --
> > Johannes
> >
> > ---
> > libphobos/ChangeLog:
> >
> > 2018-11-28  Johannes Pfau  
> >
> > * libdruntime/core/internal/convert.d: Backport from latest 
> > druntime.
> > * libdruntime/core/internal/hash.d: Likewise.
> > * libdruntime/core/internal/traits.d: Likewise.
> > * libdruntime/rt/util/typeinfo.d: Likewise.
> >
> >  libphobos/libdruntime/core/internal/convert.d |  136 ++-
> >  libphobos/libdruntime/core/internal/hash.d| 1044 +++--
> >  libphobos/libdruntime/core/internal/traits.d  |   19 +
> >  libphobos/libdruntime/rt/util/typeinfo.d  |   33 +-
> >  4 files changed, 815 insertions(+), 417 deletions(-)
> >
>
> I had a quick look at the associated druntime PRs, and this looks like
> we're only selectively applying many partial patches.  It would be
> better to apply each dependent patch one at a time, so we don't have a
> half complete backport.
>
> These would be for instance PRs 2197, 2202, 2210, 2200, 2227, 2209,
> 2198, 2243, 2240, 2246, 2311 - I stopped here but there are a few more
> to catch up with the internal/hash implementation, and maybe a few
> more in-between that I didn't spot.
>
> This would make transition from this 2.076+backports to 2.08x or 2.09x
> a little simpler, and we can test these for any problems ahead of
> time.
>

I committed the first part of this as outlined in my suggestion in r268754.

-- 
Iain


Re: [PATCH 12/43] i386: Emulate MMX vec_dupv2si with SSE

2019-02-10 Thread H.J. Lu
On Sun, Feb 10, 2019 at 2:36 AM Uros Bizjak  wrote:
>
> On 2/10/19, H.J. Lu  wrote:
> > Emulate MMX vec_dupv2si with SSE.  Only SSE register source operand is
> > allowed.
> >
> >   PR target/89021
> >   * config/i386/mmx.md (*vec_dupv2si): Changed to
> >   define_insn_and_split and also allow TARGET_MMX_WITH_SSE to
> >   support SSE emulation.
> >   * config/i386/sse.md (*vec_dupv4si): Renamed to ...
> >   (vec_dupv4si): This.
> > ---
> >  gcc/config/i386/mmx.md | 27 ---
> >  gcc/config/i386/sse.md |  2 +-
> >  2 files changed, 21 insertions(+), 8 deletions(-)
> >
> > diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> > index d360e97c98b..1ee51c5deb7 100644
> > --- a/gcc/config/i386/mmx.md
> > +++ b/gcc/config/i386/mmx.md
> > @@ -1420,14 +1420,27 @@
> > (set_attr "length_immediate" "1")
> > (set_attr "mode" "DI")])
> >
> > -(define_insn "*vec_dupv2si"
> > -  [(set (match_operand:V2SI 0 "register_operand" "=y")
> > +(define_insn_and_split "*vec_dupv2si"
> > +  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
> >   (vec_duplicate:V2SI
> > -   (match_operand:SI 1 "register_operand" "0")))]
> > -  "TARGET_MMX"
> > -  "punpckldq\t%0, %0"
> > -  [(set_attr "type" "mmxcvt")
> > -   (set_attr "mode" "DI")])
> > +   (match_operand:SI 1 "register_operand" "0,0,Yv")))]
> > +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
> > +  "@
> > +   punpckldq\t%0, %0
> > +   #
> > +   #"
> > +  "&& reload_completed && TARGET_MMX_WITH_SSE"
>
> Please fix above.

I will use

"TARGET_MMX_WITH_SSE && reload_completed"

> > +  [(const_int 0)]
> > +{
> > +  /* Emulate MMX vec_dupv2si with SSE vec_dupv4si.  */
> > +  rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> > +  rtx insn = gen_vec_dupv4si (op0, operands[1]);
> > +  emit_insn (insn);
> > +  DONE;
>
> Please write this simple RTX explicitly in the place of (const_int 0) above.

rtx insn = gen_vec_dupv4si (op0, operands[1]);

is easy.   How do I write

rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));

in place of  (const_int 0)?


> Uros.
>
> > +}
> > +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> > +   (set_attr "type" "mmxcvt,ssemov,ssemov")
> > +   (set_attr "mode" "DI,TI,TI")])
> >
> >  (define_insn "*mmx_concatv2si"
> >[(set (match_operand:V2SI 0 "register_operand" "=y,y")
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index 5dc0930ac1f..7d2c0367911 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -18976,7 +18976,7 @@
> > (set_attr "prefix" "maybe_evex,maybe_evex,orig")
> > (set_attr "mode" "V4SF")])
> >
> > -(define_insn "*vec_dupv4si"
> > +(define_insn "vec_dupv4si"
> >[(set (match_operand:V4SI 0 "register_operand" "=v,v,x")
> >   (vec_duplicate:V4SI
> > (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0")))]
> > --
> > 2.20.1
> >
> >



-- 
H.J.


Re: Make clear, when contributions will be ignored

2019-02-10 Thread Segher Boessenkool
Hi Dilyan,

On Sun, Feb 10, 2019 at 02:45:02PM +, Дилян Палаузов wrote:
> Do you share the opinion, that whatever can be done after receiving a 
> reminder, can be arranged also without reminder? 

Yes.  When people have time for it, they can trivially check what PRs are
still open that they are involved in.

> If yes, how do you propose to proceed, so that a 
> no-reminders-are-necessary-state is reached?

Keep things as is?  Reminders already are not necessary.

If you want more attention given to the bugs you are involved in, you can
hire people to do that, or file reports for more interesting bugs, or make
your bug reports easier to work with.

Since GCC has one major release every year, handling less urgent bugs can
take up to a year as well.

> I read in the answer of Segher, that the purpose of reminding is not only to 
> ping, but also to filter the ones who are
> pernetrant and sending manually reminders is the means to verify, that the 
> persons really want to make progress.  It was
> certainly not intentionally meant this way, but this is a possible reading.

The point is that automated reminders for PRs *are spam*.


Segher


Re: [PATCH 08/43] i386: Emulate MMX ashr3/3 with SSE

2019-02-10 Thread Uros Bizjak
On Sun, Feb 10, 2019 at 9:38 PM H.J. Lu  wrote:
>
> On Sun, Feb 10, 2019 at 2:26 AM Uros Bizjak  wrote:
> >
> > On 2/10/19, H.J. Lu  wrote:
> > > Emulate MMX ashr3/3 with SSE.  Only SSE register
> > > source operand is allowed.
> > >
> > >   PR target/89021
> > >   * config/i386/mmx.md (mmx_ashr3): Disallow with
> > >   TARGET_MMX_WITH_SSE.
> > >   (mmx_3): Likewise.
> > >   (ashr3): New.
> > >   (3): Likewise.
> >
> > > Please merge the patterns using the mmx_isa attribute.
>
> Currently, MMX pattern names have a "mmx_" prefix.  For SSE emulation, we
> > don't want such a prefix so that the middle-end can detect and use them.  If we
> > remove the "mmx_" prefix from MMX pattern names, won't the middle-end
> > generate MMX instructions in this case?  Is it safe to do so?

I meant to create a merged "*ashr3" pattern, and introduce an
ashr3 expander. From a maintainability point of view, the intention
is for instruction patterns to follow a unified approach as much as
possible, and to minimise deviations between patterns of the same family.

Uros.

>
>
> > Uros.
> >
> > > ---
> > >  gcc/config/i386/mmx.md | 38 --
> > >  1 file changed, 36 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> > > index 2024c75fa78..9e07bf31f81 100644
> > > --- a/gcc/config/i386/mmx.md
> > > +++ b/gcc/config/i386/mmx.md
> > > @@ -995,7 +995,7 @@
> > >  (ashiftrt:MMXMODE24
> > > (match_operand:MMXMODE24 1 "register_operand" "0")
> > > (match_operand:DI 2 "nonmemory_operand" "yN")))]
> > > -  "TARGET_MMX"
> > > +  "TARGET_MMX && !TARGET_MMX_WITH_SSE"
> > >"psra\t{%2, %0|%0, %2}"
> > >[(set_attr "type" "mmxshft")
> > > (set (attr "length_immediate")
> > > @@ -1009,7 +1009,7 @@
> > >  (any_lshift:MMXMODE248
> > > (match_operand:MMXMODE248 1 "register_operand" "0")
> > > (match_operand:DI 2 "nonmemory_operand" "yN")))]
> > > -  "TARGET_MMX"
> > > +  "TARGET_MMX && !TARGET_MMX_WITH_SSE"
> > >"p\t{%2, %0|%0, %2}"
> > >[(set_attr "type" "mmxshft")
> > > (set (attr "length_immediate")
> > > @@ -1018,6 +1018,40 @@
> > > (const_string "0")))
> > > (set_attr "mode" "DI")])
> > >
> > > +(define_insn "ashr3"
> > > +  [(set (match_operand:MMXMODE24 0 "register_operand" "=x,Yv")
> > > +(ashiftrt:MMXMODE24
> > > +   (match_operand:MMXMODE24 1 "register_operand" "0,Yv")
> > > +   (match_operand:DI 2 "nonmemory_operand" "xN,YvN")))]
> > > +  "TARGET_MMX_WITH_SSE"
> > > +  "@
> > > +   psra\t{%2, %0|%0, %2}
> > > +   vpsra\t{%2, %1, %0|%0, %1, %2}"
> > > +  [(set_attr "isa" "noavx,avx")
> > > +   (set_attr "type" "sseishft,sseishft")
> > > +   (set (attr "length_immediate")
> > > + (if_then_else (match_operand 2 "const_int_operand")
> > > +   (const_string "1")
> > > +   (const_string "0")))
> > > +   (set_attr "mode" "TI")])
> > > +
> > > +(define_insn "3"
> > > +  [(set (match_operand:MMXMODE248 0 "register_operand" "=x,Yv")
> > > +(any_lshift:MMXMODE248
> > > +   (match_operand:MMXMODE248 1 "register_operand" "0,Yv")
> > > +   (match_operand:DI 2 "nonmemory_operand" "xN,YvN")))]
> > > +  "TARGET_MMX_WITH_SSE"
> > > +  "@
> > > +   p\t{%2, %0|%0, %2}
> > > +   vp\t{%2, %1, %0|%0, %1, %2}"
> > > +  [(set_attr "isa" "noavx,avx")
> > > +   (set_attr "type" "sseishft,sseishft")
> > > +   (set (attr "length_immediate")
> > > + (if_then_else (match_operand 2 "const_int_operand")
> > > +   (const_string "1")
> > > +   (const_string "0")))
> > > +   (set_attr "mode" "TI")])
> > > +
> > >  ;
> > >  ;;
> > >  ;; Parallel integral comparisons
> > > --
> > > 2.20.1
> > >
> > >
>
>
>
> --
> H.J.


Re: [PATCH 08/43] i386: Emulate MMX ashr3/3 with SSE

2019-02-10 Thread H.J. Lu
On Sun, Feb 10, 2019 at 2:26 AM Uros Bizjak  wrote:
>
> On 2/10/19, H.J. Lu  wrote:
> > Emulate MMX ashr3/3 with SSE.  Only SSE register
> > source operand is allowed.
> >
> >   PR target/89021
> >   * config/i386/mmx.md (mmx_ashr3): Disallow with
> >   TARGET_MMX_WITH_SSE.
> >   (mmx_3): Likewise.
> >   (ashr3): New.
> >   (3): Likewise.
>
> Please merge the patterns using the mmx_isa attribute.

Currently, MMX pattern names have a "mmx_" prefix.  For SSE emulation, we
don't want such a prefix so that the middle-end can detect and use them.  If we
remove the "mmx_" prefix from MMX pattern names, won't the middle-end
generate MMX instructions in this case?  Is it safe to do so?


> Uros.
>
> > ---
> >  gcc/config/i386/mmx.md | 38 --
> >  1 file changed, 36 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> > index 2024c75fa78..9e07bf31f81 100644
> > --- a/gcc/config/i386/mmx.md
> > +++ b/gcc/config/i386/mmx.md
> > @@ -995,7 +995,7 @@
> >  (ashiftrt:MMXMODE24
> > (match_operand:MMXMODE24 1 "register_operand" "0")
> > (match_operand:DI 2 "nonmemory_operand" "yN")))]
> > -  "TARGET_MMX"
> > +  "TARGET_MMX && !TARGET_MMX_WITH_SSE"
> >"psra\t{%2, %0|%0, %2}"
> >[(set_attr "type" "mmxshft")
> > (set (attr "length_immediate")
> > @@ -1009,7 +1009,7 @@
> >  (any_lshift:MMXMODE248
> > (match_operand:MMXMODE248 1 "register_operand" "0")
> > (match_operand:DI 2 "nonmemory_operand" "yN")))]
> > -  "TARGET_MMX"
> > +  "TARGET_MMX && !TARGET_MMX_WITH_SSE"
> >"p\t{%2, %0|%0, %2}"
> >[(set_attr "type" "mmxshft")
> > (set (attr "length_immediate")
> > @@ -1018,6 +1018,40 @@
> > (const_string "0")))
> > (set_attr "mode" "DI")])
> >
> > +(define_insn "ashr3"
> > +  [(set (match_operand:MMXMODE24 0 "register_operand" "=x,Yv")
> > +(ashiftrt:MMXMODE24
> > +   (match_operand:MMXMODE24 1 "register_operand" "0,Yv")
> > +   (match_operand:DI 2 "nonmemory_operand" "xN,YvN")))]
> > +  "TARGET_MMX_WITH_SSE"
> > +  "@
> > +   psra\t{%2, %0|%0, %2}
> > +   vpsra\t{%2, %1, %0|%0, %1, %2}"
> > +  [(set_attr "isa" "noavx,avx")
> > +   (set_attr "type" "sseishft,sseishft")
> > +   (set (attr "length_immediate")
> > + (if_then_else (match_operand 2 "const_int_operand")
> > +   (const_string "1")
> > +   (const_string "0")))
> > +   (set_attr "mode" "TI")])
> > +
> > +(define_insn "3"
> > +  [(set (match_operand:MMXMODE248 0 "register_operand" "=x,Yv")
> > +(any_lshift:MMXMODE248
> > +   (match_operand:MMXMODE248 1 "register_operand" "0,Yv")
> > +   (match_operand:DI 2 "nonmemory_operand" "xN,YvN")))]
> > +  "TARGET_MMX_WITH_SSE"
> > +  "@
> > +   p\t{%2, %0|%0, %2}
> > +   vp\t{%2, %1, %0|%0, %1, %2}"
> > +  [(set_attr "isa" "noavx,avx")
> > +   (set_attr "type" "sseishft,sseishft")
> > +   (set (attr "length_immediate")
> > + (if_then_else (match_operand 2 "const_int_operand")
> > +   (const_string "1")
> > +   (const_string "0")))
> > +   (set_attr "mode" "TI")])
> > +
> >  ;
> >  ;;
> >  ;; Parallel integral comparisons
> > --
> > 2.20.1
> >
> >



-- 
H.J.


[PATCH v2, i386]: Fix PR89221, --enable-frame-pointer does not work as intended

2019-02-10 Thread Uros Bizjak
On Fri, Feb 8, 2019 at 1:24 PM Uros Bizjak  wrote:

> Attached patch fixes --enable-frame-pointer handling, and this way
> makes a couple of defines in config/i386/sol2.h obsolete.

It turned out that --enable-frame-pointer does not work for multilibs
at all. So, instead of pretending that -m32 on x86_64 and -m64 on i686
works as advertised, unify 32bit and 64bit handling.

2019-02-10  Uroš Bizjak  

PR target/89221
* config.gcc (i[34567]86-*-*, x86_64-*-*): Move tests for enable_cld
and enable_frame_pointer ...
* configure.ac: ... here.  Update help strings for
--enable-frame-pointer.
* configure: Regenerate.
* config/i386/i386.c (ix86_option_override_internal): Remove
USE_X86_64_FRAME_POINTER define, use USE_IX86_FRAME_POINTER instead.
* config/i386/sol2.h (USE_IX86_FRAME_POINTER): Remove.
(USE_X86_64_FRAME_POINTER): Ditto.

Please note that this fix will re-enable frame pointer for all targets
but linux* or darwin[8912]*. However, since builds for e.g. cygwin
and mingw survived just fine without frame pointers in the meantime,
we should probably list these targets as targets without frame
pointers by default. Maintainers should decide.

Which makes the patch gcc-10 material.

Uros.
Index: config/i386/sol2.h
===
--- config/i386/sol2.h  (revision 268670)
+++ config/i386/sol2.h  (working copy)
@@ -248,9 +248,6 @@
 #define ASAN_REJECT_SPEC \
   DEF_ARCH64_SPEC("%e:-fsanitize=address is not supported in this 
configuration")
 
-#define USE_IX86_FRAME_POINTER 1
-#define USE_X86_64_FRAME_POINTER 1
-
 #undef NO_PROFILE_COUNTERS
 
 #undef MCOUNT_NAME
Index: config.gcc
===
--- config.gcc  (revision 268670)
+++ config.gcc  (working copy)
@@ -604,12 +604,6 @@
echo "This target does not support --with-abi."
exit 1
fi
-   if test "x$enable_cld" = xyes; then
-   tm_defines="${tm_defines} USE_IX86_CLD=1"
-   fi
-   if test "x$enable_frame_pointer" = xyes; then
-   tm_defines="${tm_defines} USE_IX86_FRAME_POINTER=1"
-   fi
;;
 x86_64-*-*)
case ${with_abi} in
@@ -630,12 +624,6 @@
echo "Unknown ABI used in --with-abi=$with_abi"
exit 1
esac
-   if test "x$enable_cld" = xyes; then
-   tm_defines="${tm_defines} USE_IX86_CLD=1"
-   fi
-   if test "x$enable_frame_pointer" = xyes; then
-   tm_defines="${tm_defines} USE_IX86_FRAME_POINTER=1"
-   fi
;;
 arm*-*-*)
tm_p_file="arm/arm-flags.h ${tm_p_file} arm/aarch-common-protos.h"
Index: configure
===
--- configure   (revision 268670)
+++ configure   (working copy)
@@ -1688,8 +1688,7 @@
   --enable-leading-mingw64-underscores
   enable leading underscores on 64 bit mingw targets
   --enable-cldenable -mcld by default for 32bit x86
-  --enable-frame-pointer  enable -fno-omit-frame-pointer by default for 32bit
-  x86
+  --enable-frame-pointer  enable -fno-omit-frame-pointer by default for x86
   --disable-win32-registry
   disable lookup of installation paths in the Registry
   on Windows hosts
@@ -12199,8 +12198,7 @@
 
 case $target_os in
 linux* | darwin[8912]*)
-  # Enable -fomit-frame-pointer by default for Linux and Darwin with
-  # DWARF2.
+  # Enable -fomit-frame-pointer by default for Linux and Darwin with DWARF2.
   enable_frame_pointer=no
   ;;
 *)
@@ -12211,6 +12209,17 @@
 fi
 
 
+case $target in
+i[34567]86-*-* | x86_64-*-*)
+   if test "x$enable_cld" = xyes; then
+   tm_defines="${tm_defines} USE_IX86_CLD=1"
+   fi
+   if test "x$enable_frame_pointer" = xyes; then
+   tm_defines="${tm_defines} USE_IX86_FRAME_POINTER=1 
USE_X86_64_FRAME_POINTER=1"
+   fi
+   ;;
+esac
+
 # Windows32 Registry support for specifying GCC installation paths.
 # Check whether --enable-win32-registry was given.
 if test "${enable_win32_registry+set}" = set; then :
@@ -18646,7 +18655,7 @@
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18640 "configure"
+#line 18658 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18752,7 +18761,7 @@
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18746 "configure"
+#line 18764 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -25141,7 +25150,7 @@
   no)
 ;;
   *)
-as_fn_error "'$enableval' is an invalid value for 
--enable-standard-branch-protection.\
+as_fn_error $? "'$enableval' is an invalid value for 
--enable-standard-branch-protection.\
   Valid choices are 'yes' and 

Re: Fix odr ICE on Ada LTO

2019-02-10 Thread Richard Biener
On February 10, 2019 11:48:01 AM GMT+01:00, Jan Hubicka  wrote:
>> 
>> This caused:
>> 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89272
>
>My apologies for that. Fixed by the attached patch. This is about an ICE
>with -fno-lto-odr-type-merging, which is an option I think we should drop
>(probably next stage1, but if it turns out to cause trouble, I would not be
>against dropping it from gcc 9)
>
>It is not useful and only leads to code duplication.
>
>lto-bootstrapped/regtested x86_64-linux, committed.

Looks like you attached the wrong patch. BTW, the option is new, right? If it 
is and it serves no purpose please remove it now. 

Richard. 

>Honza
>
>Index: ChangeLog
>===
>--- ChangeLog  (revision 267609)
>+++ ChangeLog  (working copy)
>@@ -1,4 +1,14 @@
>+2019-01-03  Jan Hubicka  
>+  Backport from mainline
>+  2018-08-29  Jan Hubicka  
>+
>+  PR lto/86517
>+  PR lto/88185
>+  * lto-opts.c (lto_write_options): Always stream PIC/PIE mode.
>+  * lto-wrapper.c (merge_and_complain): Fix merging of PIC/PIE.
>+
> 2019-01-04  Aaron Sawdey  
>+
>   Backport from mainline
>   2018-11-28  Aaron Sawdey  
> 
>Index: lto-opts.c
>===
>--- lto-opts.c (revision 267609)
>+++ lto-opts.c (working copy)
>@@ -78,6 +78,21 @@ lto_write_options (void)
>   && !global_options.x_flag_openacc)
> append_to_collect_gcc_options (_obstack, _p,
>  "-fno-openacc");
>+  /* Append PIC/PIE mode because its default depends on target and it is
>+ subject of merging in lto-wrapper.  */
>+  if (!global_options_set.x_flag_pic && !global_options_set.x_flag_pie)
>+{
>+   append_to_collect_gcc_options (_obstack, _p,
>+global_options.x_flag_pic == 2
>+? "-fPIC"
>+: global_options.x_flag_pic == 1
>+? "-fpic"
>+: global_options.x_flag_pie == 2
>+? "-fPIE"
>+: global_options.x_flag_pie == 1
>+? "-fpie"
>+: "-fno-pie");
>+}
> 
>/* Append options from target hook and store them to offload_lto section.  */
>   if (lto_stream_offload_p)
>Index: lto-wrapper.c
>===
>--- lto-wrapper.c  (revision 267609)
>+++ lto-wrapper.c  (working copy)
>@@ -408,6 +408,11 @@ merge_and_complain (struct cl_decoded_op
>It is a common mistake to mix few -fPIC compiled objects into otherwise
>  non-PIC code.  We do not want to build everything with PIC then.
> 
>+ Similarly we merge PIE options, however in addition we keep
>+  -fPIC + -fPIE = -fPIE
>+  -fpic + -fPIE = -fpie
>+  -fPIC/-fpic + -fpie = -fpie
>+
>It would be good to warn on mismatches, but it is bit hard to do as
>  we do not know what nothing translates to.  */
> 
>@@ -415,11 +420,38 @@ merge_and_complain (struct cl_decoded_op
> if ((*decoded_options)[j].opt_index == OPT_fPIC
> || (*decoded_options)[j].opt_index == OPT_fpic)
>   {
>-  if (!pic_option
>-  || (pic_option->value > 0) != ((*decoded_options)[j].value > 0))
>-remove_option (decoded_options, j, decoded_options_count);
>-  else if (pic_option->opt_index == OPT_fPIC
>-   && (*decoded_options)[j].opt_index == OPT_fpic)
>+  /* -fno-pic in one unit implies -fno-pic everywhere.  */
>+  if ((*decoded_options)[j].value == 0)
>+j++;
>+  /* If we have no pic option or merge in -fno-pic, we still may turn
>+ existing pic/PIC mode into pie/PIE if -fpie/-fPIE is present.  */
>+  else if ((pic_option && pic_option->value == 0)
>+   || !pic_option)
>+{
>+  if (pie_option)
>+{
>+  bool big = (*decoded_options)[j].opt_index == OPT_fPIC
>+ && pie_option->opt_index == OPT_fPIE;
>+  (*decoded_options)[j].opt_index = big ? OPT_fPIE : OPT_fpie;
>+  if (pie_option->value)
>+(*decoded_options)[j].canonical_option[0] = big ? "-fPIE" : "-fpie";
>+  else
>+(*decoded_options)[j].canonical_option[0] = big ? "-fno-pie" : "-fno-pie";
>+  (*decoded_options)[j].value = pie_option->value;
>+  j++;
>+}
>+  else if (pic_option)
>+{
>+  (*decoded_options)[j] = *pic_option;
>+  j++;
>+}
>+  /* We do not know if target defaults to pic or not, so just remove
>+ option if it is missing in one unit but enabled in other.  */
>+  else
>+remove_option (decoded_options, j, decoded_options_count);
>+}
>+ 
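
The PIC/PIE merging rules listed in the comment of the quoted patch
(-fPIC + -fPIE = -fPIE, -fpic + -fPIE = -fpie, -fPIC/-fpic + -fpie = -fpie)
can be read as "the result is a PIE flavour, and it is the big one only if
both inputs are big".  Below is a minimal sketch of that reading in plain C.
The enum and the function name are hypothetical and only serve the
illustration; they are not the option encoding lto-wrapper actually uses.

```c
/* Hypothetical levels for illustration only: "small" stands for
   -fpic/-fpie, "big" for -fPIC/-fPIE.  */
enum level { LEVEL_SMALL = 1, LEVEL_BIG = 2 };

/* Combine the PIC level seen in one unit with the PIE level seen in
   another.  The return value is the level of the resulting PIE option:
   LEVEL_BIG means -fPIE, LEVEL_SMALL means -fpie.  */
static enum level
merge_pic_with_pie (enum level pic, enum level pie)
{
  /* Only -fPIC together with -fPIE keeps the big model.  */
  if (pic == LEVEL_BIG && pie == LEVEL_BIG)
    return LEVEL_BIG;
  return LEVEL_SMALL;
}
```

This mirrors the "big" flag computed in the patch; the handling of
-fno-pic and of missing options is left out of the sketch.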

Fix canonical types of atomic types

2019-02-10 Thread Jan Hubicka
Hi,
build_qualified_type raises the alignment of atomic types to the minimal
alignment needed for atomic operations (I think it does so). For packed
structures this leads to a type variant being created and its alignment
being updated later.

If you call build_qualified_type again on packed structures, it won't
reuse the existing type, because check_base_type compares against the
alignment of the base type (which is not atomic and has smaller
alignment), and ends up creating a new variant.

When constructing canonical types, the C front end relies on types being
shared, and this eventually leads to an ICE in type simplification.

I think it is easiest to teach check_base_type about minimal alignment.
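
The alignment difference between a packed structure and its atomic variant
can be observed from user code.  The following is a minimal sketch, assuming
C11 and a compiler that supports the packed attribute (GCC or Clang); the
struct and the helper names are made up for illustration and are not taken
from the PR testcase.

```c
#include <stddef.h>

/* Illustrative type only: a packed struct has byte alignment.  */
struct __attribute__ ((packed)) packed_int
{
  int value;
};

/* Alignment of the plain (non-atomic) packed struct.  */
static size_t
plain_alignment (void)
{
  return _Alignof (struct packed_int);
}

/* Alignment of the _Atomic-qualified variant of the same struct.
   The compiler may raise it so that atomic operations are possible.  */
static size_t
atomic_alignment (void)
{
  return _Alignof (_Atomic struct packed_int);
}
```

If the two values differ, the atomic variant carries a larger alignment than
its base type, which is the situation check_base_type has to tolerate.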

Bootstrapped/regtested x86_64-linux.
PR lto/88585
* tree.c (find_atomic_core_type): Forward declare.
(check_base_type): Correctly compare alignments of atomic types.
Index: tree.c
===
--- tree.c  (revision 268742)
+++ tree.c  (working copy)
@@ -6329,18 +6329,33 @@ check_lang_type (const_tree cand, const_
   return lang_hooks.types.type_hash_eq (cand, base);
 }
 
+static tree find_atomic_core_type (const_tree type);
+
 /* Returns true iff unqualified CAND and BASE are equivalent.  */
 
 bool
 check_base_type (const_tree cand, const_tree base)
 {
-  return (TYPE_NAME (cand) == TYPE_NAME (base)
- /* Apparently this is needed for Objective-C.  */
- && TYPE_CONTEXT (cand) == TYPE_CONTEXT (base)
- /* Check alignment.  */
- && TYPE_ALIGN (cand) == TYPE_ALIGN (base)
- && attribute_list_equal (TYPE_ATTRIBUTES (cand),
-  TYPE_ATTRIBUTES (base)));
+  if (TYPE_NAME (cand) != TYPE_NAME (base)
+  /* Apparently this is needed for Objective-C.  */
+  || TYPE_CONTEXT (cand) != TYPE_CONTEXT (base)
+  || !attribute_list_equal (TYPE_ATTRIBUTES (cand),
+   TYPE_ATTRIBUTES (base)))
+return false;
+  /* Check alignment.  */
+  if (TYPE_ALIGN (cand) == TYPE_ALIGN (base))
+return true;
+  /* Atomic types increase minimal alignment.  We must do so as well
+ or we get duplicated canonical types. See PR88686.  */
+  if ((TYPE_QUALS (cand) & TYPE_QUAL_ATOMIC))
+{
+  /* See if this object can map to a basic atomic type.  */
+  tree atomic_type = find_atomic_core_type (cand);
+  if (TYPE_ALIGN (atomic_type) == TYPE_ALIGN (cand)
+ && TYPE_ALIGN (base) < TYPE_ALIGN (cand))
+   return true;
+}
+  return false;
 }
 
 /* Returns true iff CAND is equivalent to BASE with TYPE_QUALS.  */
@@ -6373,7 +6388,7 @@ check_aligned_type (const_tree cand, con
atomic types, and returns that core atomic type.  */
 
 static tree
-find_atomic_core_type (tree type)
+find_atomic_core_type (const_tree type)
 {
   tree base_atomic_type;
 


Re: Do not use TYPE_NEED_CONSTRUCTING in may_be_aliased

2019-02-10 Thread Jan Hubicka
> Hi,
> this patch drops test for TYPE_NEEDS_CONSTRUCTING in tree.h and instead
> sets TREE_READONLY to 0 for external vars of this type. For vars
> declared locally we drop TREE_READONLY while expanding constructor.
> Note that I have tried to drop TREE_READONLY always (not only for
> DECL_EXTERNAL) and it breaks a testcase where constructor is constexpr.
> So perhaps this is unnecessarily conservative for external vars having
> a constexpr ctor and perhaps it is better done by the front end.
> 
> Curiously enough, this does not fix the actual testcase in PR88677.
This turned out to be a bug in my patch: I cleared the flag too late, so
free_lang_data had very much the same effect as the may_be_aliased flag.
Here is the updated patch, bootstrapped/regtested x86_64-linux. It also
fixes the testcase, though I am not quite sure how to add it to the
testsuite.
> 
> Bootstrapped/regtested x86_64-linux, makes sense?
> 
PR lto/88777
* cgraphunit.c (analyze_functions): Clear READONLY flag for external
types that need constructing.
* tree.h (may_be_aliased): Do not check TYPE_NEEDS_CONSTRUCTING.
Index: cgraphunit.c
===
--- cgraphunit.c(revision 268741)
+++ cgraphunit.c(working copy)
@@ -1226,6 +1226,15 @@ analyze_functions (bool first_time)
&& node != first_handled_var; node = next)
 {
   next = node->next;
+  /* For symbols declared locally we clear TREE_READONLY when emitting
+the constructor (if one is needed).  For external declarations we can
+not safely assume that the type is readonly because we may be called
+during its construction.  */
+  if (TREE_CODE (node->decl) == VAR_DECL
+ && TYPE_P (TREE_TYPE (node->decl))
+ && TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (node->decl))
+ && DECL_EXTERNAL (node->decl))
+   TREE_READONLY (node->decl) = 0;
   if (!node->aux && !node->referred_to_p ())
{
  if (symtab->dump_file)
Index: tree.h
===
--- tree.h  (revision 268741)
+++ tree.h  (working copy)
@@ -5371,8 +5371,7 @@ may_be_aliased (const_tree var)
  || DECL_EXTERNAL (var)
  || TREE_ADDRESSABLE (var))
  && !((TREE_STATIC (var) || TREE_PUBLIC (var) || DECL_EXTERNAL (var))
-  && ((TREE_READONLY (var)
-   && !TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (var)))
+  && (TREE_READONLY (var)
   || (TREE_CODE (var) == VAR_DECL
   && DECL_NONALIASED (var);
 }


New French PO file for 'gcc' (version 9.1-b20190203)

2019-02-10 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the French team of translators.  The file is available at:

https://translationproject.org/latest/gcc/fr.po

(This file, 'gcc-9.1-b20190203.fr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[PATCH] rs6000: Vector shift-right should honor modulo semantics

2019-02-10 Thread Bill Schmidt
Hi!

We had a problem report for code attempting to implement a vector
right-shift for a vector long long (V2DImode) type.  The programmer noted
that we don't have a vector splat-immediate for this mode, but cleverly
realized he could use a vector char splat-immediate since only the lower
6 bits of the shift count are read.  This is a documented feature of both
the vector shift built-ins and the underlying instruction.

Starting with GCC 8, the vector shifts are expanded early in
rs6000_gimple_fold_builtin.  However, the GIMPLE folding does not currently
perform the required TRUNC_MOD_EXPR to implement the built-in semantics.
It appears that this was caught earlier for vector shift-left and fixed,
but the same problem was not fixed for vector shift-right.  This patch
fixes that.
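
As a scalar illustration of the modulo semantics described above (this is
not the GIMPLE that the patch builds), each 64-bit lane of the shift
behaves roughly as in the sketch below.  The function names are
hypothetical.  The algebraic variant assumes that right-shifting a negative
signed value is an arithmetic shift, which is implementation-defined in
ISO C but is what GCC does.

```c
#include <stdint.h>

/* One 64-bit lane of a logical shift right: only the low 6 bits of the
   count take part, i.e. the count is reduced modulo 64.  */
static uint64_t
srd_lane (uint64_t value, uint64_t count)
{
  return value >> (count % 64);
}

/* One 64-bit lane of an algebraic (sign-propagating) shift right with
   the same modulo reduction of the count.  Assumes arithmetic right
   shift of negative values.  */
static int64_t
srad_lane (int64_t value, uint64_t count)
{
  return value >> (count % 64);
}
```

With a byte splat of 4, every 64-bit lane of the count reads as
0x0404040404040404, and reducing that modulo 64 again gives 4, which is why
the vector char splat works as a shift count for V2DImode.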

While fixing that problem, I noted that we get inferior code generation
when we try to fold the vector char splat earlier, due to the type mismatch
and some additional optimizations performed in the middle end.  Because
this is a rare circumstance, it makes sense to avoid the GIMPLE folding in
this case and allow the back end to do the expansion; this produces the
clean code we're expecting with the vspltisb intact.

I've added executable tests for both shift-right algebraic and shift-right
logical.  Both fail prior to applying the patch, and work correctly
afterwards.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this okay for trunk, and for GCC 8.3 if there is no
fallout by the end of the week?

Thanks,
Bill


[gcc]

2019-02-08  Bill Schmidt  

* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Shift-right
and shift-left vector built-ins need to include a TRUNC_MOD_EXPR
for correct semantics.  Also, don't expand a vector-splat if there
is a type mismatch; let the back end handle it.

[gcc/testsuite]

2019-02-08  Bill Schmidt  

* gcc.target/powerpc/srad-modulo.c: New.
* gcc.target/powerpc/srd-modulo.c: New.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 268707)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -15735,13 +15735,37 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *
 case ALTIVEC_BUILTIN_VSRAH:
 case ALTIVEC_BUILTIN_VSRAW:
 case P8V_BUILTIN_VSRAD:
-  arg0 = gimple_call_arg (stmt, 0);
-  arg1 = gimple_call_arg (stmt, 1);
-  lhs = gimple_call_lhs (stmt);
-  g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1);
-  gimple_set_location (g, gimple_location (stmt));
-  gsi_replace (gsi, g, true);
-  return true;
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   tree arg1_type = TREE_TYPE (arg1);
+   tree unsigned_arg1_type = unsigned_type_for (TREE_TYPE (arg1));
+   tree unsigned_element_type = unsigned_type_for (TREE_TYPE (arg1_type));
+   location_t loc = gimple_location (stmt);
+   /* Force arg1 into the range valid matching the arg0 type.  */
+   /* Build a vector consisting of the max valid bit-size values.  */
+   int n_elts = VECTOR_CST_NELTS (arg1);
+   tree element_size = build_int_cst (unsigned_element_type,
+  128 / n_elts);
+   tree_vector_builder elts (unsigned_arg1_type, n_elts, 1);
+   for (int i = 0; i < n_elts; i++)
+ elts.safe_push (element_size);
+   tree modulo_tree = elts.build ();
+   /* Modulo the provided shift value against that vector.  */
+   gimple_seq stmts = NULL;
+   tree unsigned_arg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+  unsigned_arg1_type, arg1);
+   tree new_arg1 = gimple_build (&stmts, loc, TRUNC_MOD_EXPR,
+ unsigned_arg1_type, unsigned_arg1,
+ modulo_tree);
+   gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+   /* And finally, do the shift.  */
+   g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, new_arg1);
+   gimple_set_location (g, loc);
+   gsi_replace (gsi, g, true);
+   return true;
+  }
/* Flavors of vector shift left.
   builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}.  */
 case ALTIVEC_BUILTIN_VSLB:
@@ -15795,14 +15819,34 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *
arg0 = gimple_call_arg (stmt, 0);
arg1 = gimple_call_arg (stmt, 1);
lhs = gimple_call_lhs (stmt);
+   tree arg1_type = TREE_TYPE (arg1);
+   tree unsigned_arg1_type = unsigned_type_for (TREE_TYPE (arg1));
+   tree unsigned_element_type = unsigned_type_for (TREE_TYPE (arg1_type));
+   location_t loc = gimple_location (stmt);
gimple_seq stmts = NULL;
/* Convert arg0 to unsigned.  */
tree arg0_unsigned
  = gimple_build (&stmts, VIEW_CONVERT_EXPR,
  

Re: [patch, fortran] Fix PR 71237

2019-02-10 Thread Paul Richard Thomas
OK. Thanks for the patch.

Paul

On Wed, 6 Feb 2019 at 20:27, Thomas Koenig  wrote:
>
> Hello world,
>
> this patch fixes a 7/8/9 regression where we tried to accept invalid
> code, which led to an ICE later on.
>
> The patch is rather straightforward.  The reason why I could not
> use gfc_expr_attr is that it does not actually return the
> flags the way they can be found in the original attributes;
> for example, an expression containing a pointer attribute is
> shown as having the target attribute, for reasons I cannot
> fathom.
>
> Regression-tested.  OK for trunk and other open branches?
>
> Regards
>
> Thomas
>
> 2019-02-06  Thomas Koenig  
>
> PR fortran/71237
> * expr.c (gfc_check_assign): Add argument is_init_expr.  If we are
> looking at an init expression, issue error if the target is not a
> TARGET and we are not looking at a procedure pointer.
> * gfortran.h (gfc_check_assign): Add optional argument
> is_init_expr.
>
> 2019-02-06  Thomas Koenig  
>
> PR fortran/71237
> * gfortran.dg/pointer_init_2.f90: Adjust error messages.
> * gfortran.dg/pointer_init_6.f90: Likewise.
> * gfortran.dg/pointer_init_9.f90: New test.



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


Re: Make clear, when contributions will be ignored

2019-02-10 Thread Дилян Палаузов
Hello,

thanks to Segher and Joseph for the feedback.

Acting primarily upon reminders is a general phenomenon in society, nothing
specific to software teams.  Think of public administration: it sometimes
acts much more cooperatively when public, private, or well-known media
report on its workflows.  Public administration also sometimes reacts only
when reminders are sent.

Not surprisingly, talking with a public administration about its policy of
acting only after receiving a reminder leads nowhere, as making progress in
such a discussion itself needs a lot of reminders.  In summary, such public
administrations insist on their right to receive reminders before acting.

Do you share the opinion that whatever can be done after receiving a
reminder can also be arranged without a reminder?  If yes, how do you
propose to proceed, so that a no-reminders-are-necessary state is reached?

I read in Segher's answer that the purpose of reminding is not only to
ping, but also to filter for those who are persistent, and that sending
reminders manually is the means to verify that the persons really want to
make progress.  It was certainly not intentionally meant this way, but this
is a possible reading.

Let me repeat that the topic is in no way GCC-specific, nor do I mean to
offend anyone.  To make things better, the causes of the current state
first have to be understood.

Raising the topic at the GNU Tools Cauldron is a very good idea, but it
would likely reach fewer people than this mailing list; I am not that deep
inside the GCC processes, and I do not know whether I can attend the next
meeting.

Regards
  Дилян

On Wed, 2019-02-06 at 06:44 -0600, Segher Boessenkool wrote:
> On Fri, Dec 07, 2018 at 10:55:11AM +, Дилян Палаузов wrote:
> > will it help, if Bugzilla is reprogrammed to send automatically weekly
> > reminders on all patches, that are not integrated yet?
> 
> No, that will not help.
> 
> If an interested party sends a friendly ping, that is of course welcome.
> But automated pings are spam: unwanted bulk mail.
> 
> > The patch I proposed on 27th Oct was first submitted towards GDB and
> then I was told to send it to GCC.  Here I was told to send it to GDB.
> > What shall happen to quit the loop?
> 
> You can cc: both sides of the discussion.  Either also gdb-patches, or also
> whoever told you to send it to GCC instead, or both.  And include a link to
> the mailing list archive of your thread on gdb-patches in your mail to
> gcc-patches, so that all parties can see the relevant context.  Make it
> easy for people to help you!
> 
> 
> Segher



[PATCH PR d/88654] Committed phobos fix for thread deadlock in std.net.curl

2019-02-10 Thread Iain Buclaw
Hi,

This patch is the library fix for a thread deadlock that occurred when
libcurl is missing.  It is only one half of the fix for the PR, the
other is for the testsuite scripts to check that libcurl exists before
attempting to run the std.net.curl unittest.

Bootstrapped and tested on x86_64-linux-gnu (-m32) to verify that the test
goes from timing out to just failing.

Committed to trunk as r268746.

-- 
Iain
---
diff --git a/libphobos/src/MERGE b/libphobos/src/MERGE
index eee413903c0..aef240e0722 100644
--- a/libphobos/src/MERGE
+++ b/libphobos/src/MERGE
@@ -1,4 +1,4 @@
-d4933a90b1e8446c04d64cd044658f2b33250bd3
+6c9fb28b0f8813d41798202a9d19c6b37ba5da5f
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/phobos repository.
diff --git a/libphobos/src/std/net/curl.d b/libphobos/src/std/net/curl.d
index 9d751411705..e3ce527c303 100644
--- a/libphobos/src/std/net/curl.d
+++ b/libphobos/src/std/net/curl.d
@@ -178,7 +178,7 @@ version (unittest)
 import std.range;
 import std.stdio;
 
-import std.socket : Address, INADDR_LOOPBACK, Socket, TcpSocket;
+import std.socket : Address, INADDR_LOOPBACK, Socket, SocketShutdown, TcpSocket;
 
 private struct TestServer
 {
@@ -192,6 +192,7 @@ version (unittest)
 private:
 string _addr;
 Tid tid;
+TcpSocket sock;
 
 static void loop(shared TcpSocket listener)
 {
@@ -215,20 +216,34 @@ version (unittest)
 
 private TestServer startServer()
 {
+tlsInit = true;
 auto sock = new TcpSocket;
 sock.bind(new InternetAddress(INADDR_LOOPBACK, InternetAddress.PORT_ANY));
 sock.listen(1);
 auto addr = sock.localAddress.toString();
 auto tid = spawn(, cast(shared) sock);
-return TestServer(addr, tid);
+return TestServer(addr, tid, sock);
 }
 
+__gshared TestServer server;
+bool tlsInit;
+
 private ref TestServer testServer()
 {
-__gshared TestServer server;
 return initOnce!server(startServer());
 }
 
+static ~this()
+{
+// terminate server from a thread local dtor of the thread that started it,
+//  because thread_joinall is called before shared module dtors
+if (tlsInit && server.sock)
+{
+server.sock.shutdown(SocketShutdown.RECEIVE);
+server.sock.close();
+}
+}
+
 private struct Request(T)
 {
 string hdrs;
@@ -429,7 +444,11 @@ if (isCurlConn!Conn)
 s.send(httpOK("Hello world"));
 });
 auto fn = std.file.deleteme;
-scope (exit) std.file.remove(fn);
+scope (exit)
+{
+if (std.file.exists(fn))
+std.file.remove(fn);
+}
 download(host, fn);
 assert(std.file.readText(fn) == "Hello world");
 }
@@ -491,7 +510,11 @@ if (isCurlConn!Conn)
 foreach (host; [testServer.addr, "http://"~testServer.addr])
 {
 auto fn = std.file.deleteme;
-scope (exit) std.file.remove(fn);
+scope (exit)
+{
+if (std.file.exists(fn))
+std.file.remove(fn);
+}
 std.file.write(fn, "upload data\n");
 testServer.handle((s) {
 auto req = s.recvReq;


Re: [PATCH 41/43] i386: Implement V2SF add/sub/mul with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> In 64-bit mode, implement V2SF add/sub/mul with SSE.  Only SSE register
> source operand is allowed.
>
> gcc/
>
>   PR target/89028
>   * config/i386/i386.md (comm): Handle mult.
>   * config/i386/mmx.md (plusminusmult): New.
>   (plusminusmult_insn): Likewise.
>   (plusminusmult_mnemonic): Likewise.
>   (plusminusmult_type): Likewise.
>   (mmx_addv2sf3): Add "&& !TARGET_MMX_WITH_SSE".
>   (*mmx_addv2sf3): Likewise.
>   (mmx_subv2sf3): Likewise.
>   (mmx_subrv2sf3): Likewise.
>   (*mmx_subv2sf3): Likewise.
>   (mmx_mulv2sf3): Likewise.
>   (*mmx_mulv2sf3): Likewise.
>   (v2sf3): New.
>   (*sse_v2sf3): Likewise.

No. There is no native support for V2SF in SSE, so we'll leave these out.

Uros.

>
> gcc/testsuite/
>
>   PR target/89028
>   * gcc.target/i386/pr89028-2.c: New test.
>   * gcc.target/i386/pr89028-3.c: Likewise.
>   * gcc.target/i386/pr89028-4.c: Likewise.
>   * gcc.target/i386/pr89028-5.c: Likewise.
>   * gcc.target/i386/pr89028-6.c: Likewise.
>   * gcc.target/i386/pr89028-7.c: Likewise.
> ---
>  gcc/config/i386/i386.md   |  3 +-
>  gcc/config/i386/mmx.md| 56 ---
>  gcc/testsuite/gcc.target/i386/pr89028-2.c | 11 +
>  gcc/testsuite/gcc.target/i386/pr89028-3.c | 14 ++
>  gcc/testsuite/gcc.target/i386/pr89028-4.c | 14 ++
>  gcc/testsuite/gcc.target/i386/pr89028-5.c | 11 +
>  gcc/testsuite/gcc.target/i386/pr89028-6.c | 14 ++
>  gcc/testsuite/gcc.target/i386/pr89028-7.c | 14 ++
>  8 files changed, 129 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89028-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89028-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89028-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89028-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89028-6.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89028-7.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 72685107fc0..cda973c0fbf 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -873,7 +873,8 @@
>
>  ;; Mark commutative operators as such in constraints.
>  (define_code_attr comm [(plus "%") (ss_plus "%") (us_plus "%")
> - (minus "") (ss_minus "") (us_minus "")])
> + (minus "") (ss_minus "") (us_minus "")
> + (mult "%")])
>
>  ;; Mapping of max and min
>  (define_code_iterator maxmin [smax smin umax umin])
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index e56d2e71168..88c1ecd9ae6 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -63,6 +63,20 @@
>  ;; Instruction suffix for truncations with saturation.
>  (define_code_attr s_trunsuffix [(ss_truncate "s") (us_truncate "u")])
>
> +(define_code_iterator plusminusmult [plus minus mult])
> +
> +;; Base name for define_insn
> +(define_code_attr plusminusmult_insn
> +  [(plus "add") (minus "sub") (mult "mul")])
> +
> +;; Base name for insn mnemonic.
> +(define_code_attr plusminusmult_mnemonic
> +  [(plus "add") (minus "sub") (mult "mul")])
> +
> +;; Insn type name for insn mnemonic.
> +(define_code_attr plusminusmult_type
> +  [(plus "add") (minus "add") (mult "mul")])
> +
>  ;
>  ;;
>  ;; Move patterns
> @@ -279,14 +293,16 @@
>   (plus:V2SF
> (match_operand:V2SF 1 "nonimmediate_operand")
> (match_operand:V2SF 2 "nonimmediate_operand")))]
> -  "TARGET_3DNOW"
> +  "TARGET_3DNOW && !TARGET_MMX_WITH_SSE"
>"ix86_fixup_binary_operands_no_copy (PLUS, V2SFmode, operands);")
>
>  (define_insn "*mmx_addv2sf3"
>[(set (match_operand:V2SF 0 "register_operand" "=y")
>   (plus:V2SF (match_operand:V2SF 1 "nonimmediate_operand" "%0")
>  (match_operand:V2SF 2 "nonimmediate_operand" "ym")))]
> -  "TARGET_3DNOW && ix86_binary_operator_ok (PLUS, V2SFmode, operands)"
> +  "TARGET_3DNOW
> +   && !TARGET_MMX_WITH_SSE
> +   && ix86_binary_operator_ok (PLUS, V2SFmode, operands)"
>"pfadd\t{%2, %0|%0, %2}"
>[(set_attr "type" "mmxadd")
> (set_attr "prefix_extra" "1")
> @@ -296,19 +312,21 @@
>[(set (match_operand:V2SF 0 "register_operand")
>  (minus:V2SF (match_operand:V2SF 1 "register_operand")
>   (match_operand:V2SF 2 "nonimmediate_operand")))]
> -  "TARGET_3DNOW")
> +  "TARGET_3DNOW && !TARGET_MMX_WITH_SSE")
>
>  (define_expand "mmx_subrv2sf3"
>[(set (match_operand:V2SF 0 "register_operand")
>  (minus:V2SF (match_operand:V2SF 2 "register_operand")
>   (match_operand:V2SF 1 "nonimmediate_operand")))]
> -  "TARGET_3DNOW")
> +  "TARGET_3DNOW && !TARGET_MMX_WITH_SSE")
>
>  (define_insn "*mmx_subv2sf3"
>[(set (match_operand:V2SF 0 "register_operand" "=y,y")
>  (minus:V2SF (match_operand:V2SF 1 

Re: [PATCH 35/43] i386: Allow MMXMODE moves with TARGET_MMX_WITH_SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
>   PR target/89021
>   * config/i386/mmx.md (MMXMODE:mov): Also allow
>   TARGET_MMX_WITH_SSE.
>   (MMXMODE:*mov_internal): Likewise.
>   (MMXMODE:movmisalign): Likewise.

OK.

Uros.

> ---
>  gcc/config/i386/mmx.md | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index dafc6c4dcb8..25954891b11 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -75,7 +75,7 @@
>  (define_expand "mov"
>[(set (match_operand:MMXMODE 0 "nonimmediate_operand")
>   (match_operand:MMXMODE 1 "nonimmediate_operand"))]
> -  "TARGET_MMX"
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
>  {
>ix86_expand_vector_move (mode, operands);
>DONE;
> @@ -86,7 +86,7 @@
>  "=r ,o ,r,r ,m ,?!y,!y,?!y,m  ,r  ,?!y,v,v,v,m,r,v,!y,*x")
>   (match_operand:MMXMODE 1 "nonimm_or_0_operand"
>  "rCo,rC,C,rm,rC,C  ,!y,m  ,?!y,?!y,r  ,C,v,m,v,v,r,*x,!y"))]
> -  "TARGET_MMX
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
>  {
>switch (get_attr_type (insn))
> @@ -237,7 +237,7 @@
>  (define_expand "movmisalign"
>[(set (match_operand:MMXMODE 0 "nonimmediate_operand")
>   (match_operand:MMXMODE 1 "nonimmediate_operand"))]
> -  "TARGET_MMX"
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
>  {
>ix86_expand_vector_move (mode, operands);
>DONE;
> --
> 2.20.1
>
>


Re: [PATCH 34/43] i386: Emulate MMX abs2 with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX abs2 with SSE.  Only SSE register source operand is
> allowed.
>
>   PR target/89021
>   * config/i386/sse.md (abs2): Add SSE emulation.

OK.

Uros.

> ---
>  gcc/config/i386/sse.md | 15 +--
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index e0ea8ab300b..018b1dca984 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -16092,16 +16092,19 @@
>  })
>
>  (define_insn "abs2"
> -  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
> +  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,Yv")
>   (abs:MMXMODEI
> -   (match_operand:MMXMODEI 1 "nonimmediate_operand" "ym")))]
> -  "TARGET_SSSE3"
> -  "pabs\t{%1, %0|%0, %1}";
> -  [(set_attr "type" "sselog1")
> +   (match_operand:MMXMODEI 1 "nonimmediate_operand" "ym,Yv")))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
> +  "@
> +   pabs\t{%1, %0|%0, %1}
> +   %vpabs\t{%1, %0|%0, %1}"
> +  [(set_attr "mmx_isa" "native,x64")
> +   (set_attr "type" "sselog1")
> (set_attr "prefix_rep" "0")
> (set_attr "prefix_extra" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p
> (insn)"))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI")])
>
>  ;
>  ;;
> --
> 2.20.1
>
>
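
The emulation idea behind this series, as far as it can be read from the
patches, is that a 64-bit MMX-mode value lives in the low half of a 128-bit
SSE register and the 128-bit instruction is used, with the upper half
ignored.  For a lane-wise operation such as abs this is safe because the
lanes do not interact.  Below is a minimal scalar sketch of that argument in
portable C; the function names are hypothetical and no intrinsics are
involved.

```c
#include <stdint.h>
#include <string.h>

/* Absolute value of one 16-bit lane (ordinary inputs only; the most
   negative value is not treated specially in this sketch).  */
static int16_t
abs_lane (int16_t x)
{
  return (int16_t) (x < 0 ? -x : x);
}

/* Model of the 128-bit operation: all eight lanes are processed
   independently.  */
static void
abs_v8hi (int16_t dst[8], const int16_t src[8])
{
  for (int i = 0; i < 8; i++)
    dst[i] = abs_lane (src[i]);
}

/* Emulated 64-bit operation: place the four lanes in the low half of a
   128-bit value, run the wide operation, keep only the low half.  */
static void
abs_v4hi_emulated (int16_t dst[4], const int16_t src[4])
{
  int16_t wide_src[8] = { 0 };	/* upper four lanes are don't-care */
  int16_t wide_dst[8];

  memcpy (wide_src, src, 4 * sizeof (int16_t));
  abs_v8hi (wide_dst, wide_src);
  memcpy (dst, wide_dst, 4 * sizeof (int16_t));
}
```

The same reasoning does not automatically carry over to operations where
lanes interact or where the contents of the upper lanes can have side
effects.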


Re: [PATCH 32/43] i386: Emulate MMX ssse3_psign3 with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX ssse3_psign3 with SSE.  Only SSE register source operand
> is allowed.
>
>   PR target/89021
>   * config/i386/sse.md (ssse3_psign3): Add SSE emulation.

OK.

Uros.

> ---
>  gcc/config/i386/sse.md | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 00e1fe03995..c3dcb6bc6b1 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -15908,17 +15908,21 @@
> (set_attr "mode" "")])
>
>  (define_insn "ssse3_psign3"
> -  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
> +  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv")
>   (unspec:MMXMODEI
> -   [(match_operand:MMXMODEI 1 "register_operand" "0")
> -(match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")]
> +   [(match_operand:MMXMODEI 1 "register_operand" "0,0,Yv")
> +(match_operand:MMXMODEI 2 "nonimmediate_operand" "ym,x,Yv")]
> UNSPEC_PSIGN))]
> -  "TARGET_SSSE3"
> -  "psign\t{%2, %0|%0, %2}";
> -  [(set_attr "type" "sselog1")
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
> +  "@
> +   psign\t{%2, %0|%0, %2}
> +   psign\t{%2, %0|%0, %2}
> +   vpsign\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "sselog1")
> (set_attr "prefix_extra" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p
> (insn)"))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "_palignr_mask"
>[(set (match_operand:VI1_AVX512 0 "register_operand" "=v")
> --
> 2.20.1
>
>


Re: [PATCH 29/43] i386: Emulate MMX ssse3_pmaddubsw with SSE

2019-02-10 Thread graham stott via gcc-patches
What about testcases for these?


 Original message 
From: Uros Bizjak  
Date: 10/02/2019  12:26  (GMT+00:00) 
To: "H.J. Lu"  
Cc: gcc-patches@gcc.gnu.org 
Subject: Re: [PATCH 29/43] i386: Emulate MMX ssse3_pmaddubsw with SSE 

On 2/10/19, H.J. Lu  wrote:
> Emulate MMX ssse3_pmaddubsw with SSE.  Only SSE register source operand
> is allowed.
>
>   PR target/89021
>   * config/i386/sse.md (ssse3_pmaddubsw): Add SSE emulation.

OK.

Uros.

> ---
>  gcc/config/i386/sse.md | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 4bcfd3fc272..8b13a76da72 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -15666,17 +15666,17 @@
> (set_attr "mode" "TI")])
>
>  (define_insn "ssse3_pmaddubsw"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
>   (ss_plus:V4HI
>     (mult:V4HI
>       (zero_extend:V4HI
>     (vec_select:V4QI
> - (match_operand:V8QI 1 "register_operand" "0")
> + (match_operand:V8QI 1 "register_operand" "0,0,Yv")
>   (parallel [(const_int 0) (const_int 2)
>      (const_int 4) (const_int 6)])))
>       (sign_extend:V4HI
>     (vec_select:V4QI
> - (match_operand:V8QI 2 "nonimmediate_operand" "ym")
> + (match_operand:V8QI 2 "nonimmediate_operand" "ym,x,Yv")
>   (parallel [(const_int 0) (const_int 2)
>      (const_int 4) (const_int 6)]
>     (mult:V4HI
> @@ -15688,13 +15688,17 @@
>     (vec_select:V4QI (match_dup 2)
>   (parallel [(const_int 1) (const_int 3)
>      (const_int 5) (const_int 7)]))]
> -  "TARGET_SSSE3"
> -  "pmaddubsw\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "sseiadd")
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
> +  "@
> +   pmaddubsw\t{%2, %0|%0, %2}
> +   pmaddubsw\t{%2, %0|%0, %2}
> +   vpmaddubsw\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "sseiadd")
> (set_attr "atom_unit" "simul")
> (set_attr "prefix_extra" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p
> (insn)"))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_mode_iterator PMULHRSW
>    [V4HI V8HI (V16HI "TARGET_AVX2")])
> --
> 2.20.1
>
>


Re: [PATCH 30/43] i386: Emulate MMX ssse3_pmulhrswv4hi3 with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX ssse3_pmulhrswv4hi3 with SSE.  Only SSE register source
> operand is allowed.
>
>   PR target/89021
>   * config/i386/sse.md (*ssse3_pmulhrswv4hi3): Add SSE emulation.

OK.

Uros.

> ---
>  gcc/config/i386/sse.md | 20 +---
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 8b13a76da72..0d0f84705d1 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -15774,25 +15774,31 @@
> (set_attr "mode" "")])
>
>  (define_insn "*ssse3_pmulhrswv4hi3"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
>   (truncate:V4HI
> (lshiftrt:V4SI
>   (plus:V4SI
> (lshiftrt:V4SI
>   (mult:V4SI
> (sign_extend:V4SI
> - (match_operand:V4HI 1 "nonimmediate_operand" "%0"))
> + (match_operand:V4HI 1 "nonimmediate_operand" "%0,0,Yv"))
> (sign_extend:V4SI
> - (match_operand:V4HI 2 "nonimmediate_operand" "ym")))
> + (match_operand:V4HI 2 "nonimmediate_operand" "ym,x,Yv")))
>   (const_int 14))
> (match_operand:V4HI 3 "const1_operand"))
>   (const_int 1]
> -  "TARGET_SSSE3 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
> -  "pmulhrsw\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "sseimul")
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && TARGET_SSSE3
> +   && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
> +  "@
> +   pmulhrsw\t{%2, %0|%0, %2}
> +   pmulhrsw\t{%2, %0|%0, %2}
> +   vpmulhrsw\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "sseimul")
> (set_attr "prefix_extra" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p
> (insn)"))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "_pshufb3"
>[(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x,v")
> --
> 2.20.1
>
>


Re: [PATCH 29/43] i386: Emulate MMX ssse3_pmaddubsw with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX ssse3_pmaddubsw with SSE.  Only SSE register source operand
> is allowed.
>
>   PR target/89021
>   * config/i386/sse.md (ssse3_pmaddubsw): Add SSE emulation.

OK.

Uros.

> ---
>  gcc/config/i386/sse.md | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 4bcfd3fc272..8b13a76da72 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -15666,17 +15666,17 @@
> (set_attr "mode" "TI")])
>
>  (define_insn "ssse3_pmaddubsw"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
>   (ss_plus:V4HI
> (mult:V4HI
>   (zero_extend:V4HI
> (vec_select:V4QI
> - (match_operand:V8QI 1 "register_operand" "0")
> + (match_operand:V8QI 1 "register_operand" "0,0,Yv")
>   (parallel [(const_int 0) (const_int 2)
>  (const_int 4) (const_int 6)])))
>   (sign_extend:V4HI
> (vec_select:V4QI
> - (match_operand:V8QI 2 "nonimmediate_operand" "ym")
> + (match_operand:V8QI 2 "nonimmediate_operand" "ym,x,Yv")
>   (parallel [(const_int 0) (const_int 2)
>  (const_int 4) (const_int 6)]
> (mult:V4HI
> @@ -15688,13 +15688,17 @@
> (vec_select:V4QI (match_dup 2)
>   (parallel [(const_int 1) (const_int 3)
>  (const_int 5) (const_int 7)]))]
> -  "TARGET_SSSE3"
> -  "pmaddubsw\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "sseiadd")
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
> +  "@
> +   pmaddubsw\t{%2, %0|%0, %2}
> +   pmaddubsw\t{%2, %0|%0, %2}
> +   vpmaddubsw\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "sseiadd")
> (set_attr "atom_unit" "simul")
> (set_attr "prefix_extra" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p
> (insn)"))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_mode_iterator PMULHRSW
>[V4HI V8HI (V16HI "TARGET_AVX2")])
> --
> 2.20.1
>
>


Re: [PATCH 28/43] i386: Emulate MMX ssse3_phdv2si3 with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX ssse3_phdv2si3 with SSE by moving bits
> 64:95 to bits 32:63 in SSE register.  Only SSE register source operand
> is allowed.
>
>   PR target/89021
>   * config/i386/sse.md (ssse3_phdv2si3):
>   Changed to define_insn_and_split to support SSE emulation.
> ---
>  gcc/config/i386/sse.md | 32 
>  1 file changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 6138f59f267..4bcfd3fc272 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -15480,26 +15480,42 @@
> (set_attr "prefix" "orig,vex")
> (set_attr "mode" "TI")])
>
> -(define_insn "ssse3_phdv2si3"
> -  [(set (match_operand:V2SI 0 "register_operand" "=y")
> +(define_insn_and_split "ssse3_phdv2si3"
> +  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
>   (vec_concat:V2SI
> (plusminus:SI
>   (vec_select:SI
> -   (match_operand:V2SI 1 "register_operand" "0")
> +   (match_operand:V2SI 1 "register_operand" "0,0,Yv")
> (parallel [(const_int 0)]))
>   (vec_select:SI (match_dup 1) (parallel [(const_int 1)])))
> (plusminus:SI
>   (vec_select:SI
> -   (match_operand:V2SI 2 "nonimmediate_operand" "ym")
> +   (match_operand:V2SI 2 "nonimmediate_operand" "ym,x,Yv")
> (parallel [(const_int 0)]))
>   (vec_select:SI (match_dup 2) (parallel [(const_int 1)])]
> -  "TARGET_SSSE3"
> -  "phd\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "sseiadd")
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
> +  "@
> +   phd\t{%2, %0|%0, %2}
> +   #
> +   #"
> +  "&& reload_completed && TARGET_MMX_WITH_SSE"

Please fix split condition.

Uros.

> +  [(const_int 0)]
> +{
> +  /* Generate SSE version of the operation.  */
> +  rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> +  rtx op1 = gen_rtx_REG (V4SImode, REGNO (operands[1]));
> +  rtx op2 = gen_rtx_REG (V4SImode, REGNO (operands[2]));
> +  rtx insn = gen_ssse3_phdv4si3 (op0, op1, op2);
> +  emit_insn (insn);
> +  ix86_move_vector_high_sse_to_mmx (op0);
> +  DONE;
> +}
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "sseiadd")
> (set_attr "atom_unit" "complex")
> (set_attr "prefix_extra" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p
> (insn)"))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "avx2_pmaddubsw256"
>[(set (match_operand:V16HI 0 "register_operand" "=x,v")
> --
> 2.20.1
>
>
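
To make the "moving bits 64:95 to bits 32:63" step concrete, here is a minimal C model of the add case, offered only as a sketch. It assumes the usual horizontal-add lane layout for the 128-bit operation and treats the upper two lanes of each source as don't-care data. The function names are hypothetical and are not the GCC helpers used in the patch.

```c
#include <stdint.h>

/* Model of the 128-bit horizontal add on four 32-bit lanes:
   lanes 0-1 come from pairs of A, lanes 2-3 from pairs of B.  */
void phaddd_v4si_model(uint32_t dst[4], const uint32_t a[4],
                       const uint32_t b[4])
{
  uint32_t r0 = a[0] + a[1];
  uint32_t r1 = a[2] + a[3];   /* Built from don't-care lanes of A.  */
  uint32_t r2 = b[0] + b[1];
  uint32_t r3 = b[2] + b[3];   /* Built from don't-care lanes of B.  */
  dst[0] = r0;
  dst[1] = r1;
  dst[2] = r2;
  dst[3] = r3;
}

/* Emulated 64-bit (V2SI) result: run the 128-bit operation, then copy
   lane 2 (bits 64:95) into lane 1 (bits 32:63) so that the low 64 bits
   hold { a0 + a1, b0 + b1 }.  */
void phaddd_v2si_emulated(uint32_t dst[4], const uint32_t a[4],
                          const uint32_t b[4])
{
  phaddd_v4si_model(dst, a, b);
  dst[1] = dst[2];
}
```

Only dst[0] and dst[1] are meaningful afterwards; the upper lanes are whatever the wide operation left there.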


Re: [PATCH 27/43] i386: Emulate MMX ssse3_phwv4hi3 with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX ssse3_phwv4hi3 with SSE by moving bits
> 64:95 to bits 32:63 in SSE register.  Only SSE register source operand
> is allowed.
>
>   PR target/89021
>   * config/i386/sse.md (ssse3_phwv4hi3):
>   Changed to define_insn_and_split to support SSE emulation.
> ---
>  gcc/config/i386/sse.md | 32 
>  1 file changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 4be5ab44e81..6138f59f267 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -15358,13 +15358,13 @@
> (set_attr "prefix" "orig,vex")
> (set_attr "mode" "TI")])
>
> -(define_insn "ssse3_phwv4hi3"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +(define_insn_and_split "ssse3_phwv4hi3"
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
>   (vec_concat:V4HI
> (vec_concat:V2HI
>   (ssse3_plusminus:HI
> (vec_select:HI
> - (match_operand:V4HI 1 "register_operand" "0")
> + (match_operand:V4HI 1 "register_operand" "0,0,Yv")
>   (parallel [(const_int 0)]))
> (vec_select:HI (match_dup 1) (parallel [(const_int 1)])))
>   (ssse3_plusminus:HI
> @@ -15373,19 +15373,35 @@
> (vec_concat:V2HI
>   (ssse3_plusminus:HI
> (vec_select:HI
> - (match_operand:V4HI 2 "nonimmediate_operand" "ym")
> + (match_operand:V4HI 2 "nonimmediate_operand" "ym,x,Yv")
>   (parallel [(const_int 0)]))
> (vec_select:HI (match_dup 2) (parallel [(const_int 1)])))
>   (ssse3_plusminus:HI
> (vec_select:HI (match_dup 2) (parallel [(const_int 2)]))
> (vec_select:HI (match_dup 2) (parallel [(const_int 3)]))]
> -  "TARGET_SSSE3"
> -  "phw\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "sseiadd")
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
> +  "@
> +   phw\t{%2, %0|%0, %2}
> +   #
> +   #"
> +  "&& reload_completed && TARGET_MMX_WITH_SSE"

Please fix split condition.

Uros.

> +  [(const_int 0)]
> +{
> +  /* Generate SSE version of the operation.  */
> +  rtx op0 = gen_rtx_REG (V8HImode, REGNO (operands[0]));
> +  rtx op1 = gen_rtx_REG (V8HImode, REGNO (operands[1]));
> +  rtx op2 = gen_rtx_REG (V8HImode, REGNO (operands[2]));
> +  rtx insn = gen_ssse3_phwv8hi3 (op0, op1, op2);
> +  emit_insn (insn);
> +  ix86_move_vector_high_sse_to_mmx (op0);
> +  DONE;
> +}
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "sseiadd")
> (set_attr "atom_unit" "complex")
> (set_attr "prefix_extra" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p
> (insn)"))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "avx2_phdv8si3"
>[(set (match_operand:V8SI 0 "register_operand" "=x")
> --
> 2.20.1
>
>


Re: [PATCH 22/43] i386: Emulate MMX mmx_uavgv8qi3 with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX mmx_uavgv8qi3 with SSE.  Only SSE register source operand is
> allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_uavgv8qi3): Add SSE emulation support.
>   (*mmx_uavgv8qi3): Add SSE emulation.

Please change insn conditions here and up to patch 25/43.

Uros.

> ---
>  gcc/config/i386/mmx.md | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 0e5bfe6baff..38743ea10fd 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1684,42 +1684,47 @@
> (const_int 1) (const_int 1)
> (const_int 1) (const_int 1)]))
>   (const_int 1]
> -  "TARGET_SSE || TARGET_3DNOW"
> +  "((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +   || TARGET_3DNOW"
>"ix86_fixup_binary_operands_no_copy (PLUS, V8QImode, operands);")
>
>  (define_insn "*mmx_uavgv8qi3"
> -  [(set (match_operand:V8QI 0 "register_operand" "=y")
> +  [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv")
>   (truncate:V8QI
> (lshiftrt:V8HI
>   (plus:V8HI
> (plus:V8HI
>   (zero_extend:V8HI
> -   (match_operand:V8QI 1 "nonimmediate_operand" "%0"))
> +   (match_operand:V8QI 1 "nonimmediate_operand" "%0,0,Yv"))
>   (zero_extend:V8HI
> -   (match_operand:V8QI 2 "nonimmediate_operand" "ym")))
> +   (match_operand:V8QI 2 "nonimmediate_operand" "ym,x,Yv")))
> (const_vector:V8HI [(const_int 1) (const_int 1)
> (const_int 1) (const_int 1)
> (const_int 1) (const_int 1)
> (const_int 1) (const_int 1)]))
>   (const_int 1]
> -  "(TARGET_SSE || TARGET_3DNOW)
> +  "(((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +|| TARGET_3DNOW_A)
> && ix86_binary_operator_ok (PLUS, V8QImode, operands)"
>  {
>/* These two instructions have the same operation, but their encoding
>   is different.  Prefer the one that is de facto standard.  */
> -  if (TARGET_SSE || TARGET_3DNOW_A)
> +  if (TARGET_MMX_WITH_SSE && TARGET_AVX)
> +return "vpavgb\t{%2, %1, %0|%0, %1, %2}";
> +  else if (TARGET_SSE || TARGET_3DNOW_A)
>  return "pavgb\t{%2, %0|%0, %2}";
>else
>  return "pavgusb\t{%2, %0|%0, %2}";
>  }
> -  [(set_attr "type" "mmxshft")
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxshft,sseiadd,sseiadd")
> (set (attr "prefix_extra")
>   (if_then_else
> (not (ior (match_test "TARGET_SSE")
>(match_test "TARGET_3DNOW_A")))
> (const_string "1")
> (const_string "*")))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_uavgv4hi3"
>[(set (match_operand:V4HI 0 "register_operand")
> --
> 2.20.1
>
>


Re: [PATCH 26/43] i386: Emulate MMX umulv1siv1di3 with SSE2

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX umulv1siv1di3 with SSE2.  Only SSE register source operand
> is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (sse2_umulv1siv1di3): Add SSE emulation
>   support.
>   (*sse2_umulv1siv1di3): Add SSE2 emulation.

OK.

Uros.

> ---
>  gcc/config/i386/mmx.md | 22 ++
>  1 file changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 481b987f4a7..dafc6c4dcb8 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -954,24 +954,30 @@
>   (vec_select:V1SI
> (match_operand:V2SI 2 "nonimmediate_operand")
> (parallel [(const_int 0)])]
> -  "TARGET_SSE2"
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE2"
>"ix86_fixup_binary_operands_no_copy (MULT, V2SImode, operands);")
>
>  (define_insn "*sse2_umulv1siv1di3"
> -  [(set (match_operand:V1DI 0 "register_operand" "=y")
> +  [(set (match_operand:V1DI 0 "register_operand" "=y,x,Yv")
>  (mult:V1DI
> (zero_extend:V1DI
>   (vec_select:V1SI
> -   (match_operand:V2SI 1 "nonimmediate_operand" "%0")
> +   (match_operand:V2SI 1 "nonimmediate_operand" "%0,0,Yv")
> (parallel [(const_int 0)])))
> (zero_extend:V1DI
>   (vec_select:V1SI
> -   (match_operand:V2SI 2 "nonimmediate_operand" "ym")
> +   (match_operand:V2SI 2 "nonimmediate_operand" "ym,x,Yv")
> (parallel [(const_int 0)])]
> -  "TARGET_SSE2 && ix86_binary_operator_ok (MULT, V2SImode, operands)"
> -  "pmuludq\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxmul")
> -   (set_attr "mode" "DI")])
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && TARGET_SSE2
> +   && ix86_binary_operator_ok (MULT, V2SImode, operands)"
> +  "@
> +   pmuludq\t{%2, %0|%0, %2}
> +   pmuludq\t{%2, %0|%0, %2}
> +   vpmuludq\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxmul,ssemul,ssemul")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_v4hi3"
>[(set (match_operand:V4HI 0 "register_operand")
> --
> 2.20.1
>
>


Re: [PATCH 20/43] i386: Emulate MMX mmx_umulv4hi3_highpart with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX mmx_umulv4hi3_highpart with SSE.  Only SSE register source
> operand is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (*mmx_umulv4hi3_highpart): Add SSE emulation.
> ---
>  gcc/config/i386/mmx.md | 19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 0d00896127b..0e5bfe6baff 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -817,24 +817,29 @@
> (zero_extend:V4SI
>   (match_operand:V4HI 2 "nonimmediate_operand")))
>   (const_int 16]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> +  "((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +   || TARGET_3DNOW_A"

Please change insn condition.

Uros.

>"ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
>
>  (define_insn "*mmx_umulv4hi3_highpart"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
>   (truncate:V4HI
> (lshiftrt:V4SI
>   (mult:V4SI
> (zero_extend:V4SI
> - (match_operand:V4HI 1 "nonimmediate_operand" "%0"))
> + (match_operand:V4HI 1 "nonimmediate_operand" "%0,0,Yv"))
> (zero_extend:V4SI
> - (match_operand:V4HI 2 "nonimmediate_operand" "ym")))
> + (match_operand:V4HI 2 "nonimmediate_operand" "ym,x,Yv")))
> (const_int 16]
>"(TARGET_SSE || TARGET_3DNOW_A)
> && ix86_binary_operator_ok (MULT, V4HImode, operands)"
> -  "pmulhuw\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxmul")
> -   (set_attr "mode" "DI")])
> +  "@
> +   pmulhuw\t{%2, %0|%0, %2}
> +   pmulhuw\t{%2, %0|%0, %2}
> +   vpmulhuw\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxmul,ssemul,ssemul")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_pmaddwd"
>[(set (match_operand:V2SI 0 "register_operand")
> --
> 2.20.1
>
>


Re: [PATCH 19/43] i386: Emulate MMX mmx_pmovmskb with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX mmx_pmovmskb with SSE by zero-extending result of SSE pmovmskb
> from QImode to SImode.  Only SSE register source operand is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_pmovmskb): Changed to
>   define_insn_and_split to support SSE emulation.
> ---
>  gcc/config/i386/mmx.md | 32 +---
>  1 file changed, 25 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 3390e42ea5b..0d00896127b 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1761,14 +1761,32 @@
>[(set_attr "type" "mmxshft")
> (set_attr "mode" "DI")])
>
> -(define_insn "mmx_pmovmskb"
> -  [(set (match_operand:SI 0 "register_operand" "=r")
> - (unspec:SI [(match_operand:V8QI 1 "register_operand" "y")]
> +(define_insn_and_split "mmx_pmovmskb"
> +  [(set (match_operand:SI 0 "register_operand" "=r,r")
> + (unspec:SI [(match_operand:V8QI 1 "register_operand" "y,x")]
>  UNSPEC_MOVMSK))]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> -  "pmovmskb\t{%1, %0|%0, %1}"
> -  [(set_attr "type" "mmxcvt")
> -   (set_attr "mode" "DI")])
> +  "((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +   || TARGET_3DNOW_A"

Please change insn condition ...

> +  "@
> +   pmovmskb\t{%1, %0|%0, %1}
> +   #"
> +  "&& reload_completed && TARGET_MMX_WITH_SSE"

... and split condition ...

> +  [(const_int 0)]
> +{
> +  /* Generate SSE pmovmskb.  */
> +  rtx op0 = operands[0];
> +  rtx op1 = gen_rtx_REG (V16QImode, REGNO (operands[1]));
> +  rtx insn = gen_sse2_pmovmskb (op0, op1);
> +  emit_insn (insn);
> +  /* Zero-extend from QImode to SImode.  */
> +  op1 = gen_rtx_REG (QImode, REGNO (operands[0]));
> +  insn = gen_zero_extendqisi2 (op0, op1);
> +  emit_insn (insn);
> +  DONE;

... and explicitly write RTX instead of (const_int 0).

Uros.

> +}
> +  [(set_attr "mmx_isa" "native,x64")
> +   (set_attr "type" "mmxcvt,ssemov")
> +   (set_attr "mode" "DI,TI")])
>
>  (define_expand "mmx_maskmovq"
>[(set (match_operand:V8QI 0 "memory_operand")
> --
> 2.20.1
>
>
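
As a sketch of why the zero-extension from QImode is needed, the C model below mimics a 16-byte pmovmskb and then keeps only the low 8 mask bits, which are the ones that belong to the V8QI value. The function names are hypothetical, and the upper 8 bytes of the register are assumed to hold arbitrary data.

```c
#include <stdint.h>

/* Model of the 16-byte pmovmskb: bit i of the result is the top bit
   of byte i.  */
uint32_t sse_pmovmskb_model(const uint8_t v[16])
{
  uint32_t mask = 0;
  for (int i = 0; i < 16; i++)
    mask |= (uint32_t) (v[i] >> 7) << i;
  return mask;
}

/* Emulated 8-byte variant: bits 8..15 of the wide mask come from bytes
   outside the V8QI value, so truncate to 8 bits and zero-extend.  */
uint32_t mmx_pmovmskb_emulated(const uint8_t v[16])
{
  return (uint32_t) (uint8_t) sse_pmovmskb_model(v);
}
```

Without the truncation, stale data in the upper half of the XMM register would leak into bits 8..15 of the result.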


Re: [PATCH 17/43] i386: Emulate MMX mmx_pinsrw with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX mmx_pinsrw with SSE.  Only SSE register source operand is
> allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_pinsrw): Add SSE emulation.

Please change insn condition.

Uros.

> ---
>  gcc/config/i386/mmx.md | 30 +-
>  1 file changed, 21 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 57669018d0c..5265024c529 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1321,32 +1321,44 @@
>  (match_operand:SI 2 "nonimmediate_operand"))
> (match_operand:V4HI 1 "register_operand")
>(match_operand:SI 3 "const_0_to_3_operand")))]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> +  "((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +   || TARGET_3DNOW_A"
>  {
>operands[2] = gen_lowpart (HImode, operands[2]);
>operands[3] = GEN_INT (1 << INTVAL (operands[3]));
>  })
>
>  (define_insn "*mmx_pinsrw"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
>  (vec_merge:V4HI
>(vec_duplicate:V4HI
> -(match_operand:HI 2 "nonimmediate_operand" "rm"))
> -   (match_operand:V4HI 1 "register_operand" "0")
> +(match_operand:HI 2 "nonimmediate_operand" "rm,rm,rm"))
> +   (match_operand:V4HI 1 "register_operand" "0,0,Yv")
>(match_operand:SI 3 "const_int_operand")))]
>"(TARGET_SSE || TARGET_3DNOW_A)
> && ((unsigned) exact_log2 (INTVAL (operands[3]))
> < GET_MODE_NUNITS (V4HImode))"
>  {
>operands[3] = GEN_INT (exact_log2 (INTVAL (operands[3])));
> -  if (MEM_P (operands[2]))
> -return "pinsrw\t{%3, %2, %0|%0, %2, %3}";
> +  if (TARGET_MMX_WITH_SSE && TARGET_AVX)
> +{
> +  if (MEM_P (operands[2]))
> + return "vpinsrw\t{%3, %2, %1, %0|%0, %1, %2, %3}";
> +  else
> + return "vpinsrw\t{%3, %k2, %1, %0|%0, %1, %k2, %3}";
> +}
>else
> -return "pinsrw\t{%3, %k2, %0|%0, %k2, %3}";
> +{
> +  if (MEM_P (operands[2]))
> + return "pinsrw\t{%3, %2, %0|%0, %2, %3}";
> +  else
> + return "pinsrw\t{%3, %k2, %0|%0, %k2, %3}";
> +}
>  }
> -  [(set_attr "type" "mmxcvt")
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxcvt,sselog,sselog")
> (set_attr "length_immediate" "1")
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "mmx_pextrw"
>[(set (match_operand:SI 0 "register_operand" "=r,r")
> --
> 2.20.1
>
>


Re: [PATCH] Updated patches for the port of gccgo to GNU/Hurd

2019-02-10 Thread Svante Signell
On Sat, 2019-02-09 at 23:57 +0100, Svante Signell wrote:
> On Sat, 2019-02-09 at 14:40 -0800, Ian Lance Taylor wrote:
> > On Fri, Feb 8, 2019 at 3:07 PM Matthias Klose  wrote:
> > > On 07.02.19 06:04, Ian Lance Taylor wrote:
> > What are the lines before that in the log?  For some reason libtool is
> > being invoked with no source files.  The lines before the failing line
> > should show an invocation of match.sh that determines the source
> > files.
> 
> Thanks for your job upstreaming the patches!
> 
> I've found some problems. Current problem is with the mksysinfo.sh patch. But
> there are some other things missing. New patches will be submitted tomorrow. 

Attached are three additional patches needed to build libgo on GNU/Hurd:
src_libgo_mksysinfo.sh.diff
src_libgo_go_syscall_wait.c.diff
src_libgo_testsuite_gotest.diff

For the first patch, src_libgo_mksysinfo.sh.diff, I had to go back to the old
version, using sed -i -e. As written now, ${fsid_to_dev} expands to
fsid_to_dev='-e '\''s/st_fsid/Dev/'\''', so the single quotes reach sed as
literal characters, resulting in: "sed: -e expression #4, char 1: unknown
command: `''". Unfortunately, I have not yet been able to modify the expansion
to omit the single quotes stored in the shell variable.

The second patch, src_libgo_go_syscall_wait.c.diff, is needed because WCONTINUED
is not defined on GNU/Hurd, and it must be defined for WIFCONTINUED to be
defined in wait.h.

The third patch, src_libgo_testsuite_gotest.diff, is not strictly needed, but
without it, running the tests displays the annoying message "ps: comm: Unknown
format spec".

Thanks!
Index: gcc-9-9-20190208/src/libgo/go/syscall/wait.c
===
--- gcc-9-9-20190208.orig/src/libgo/go/syscall/wait.c
+++ gcc-9-9-20190208/src/libgo/go/syscall/wait.c
@@ -8,6 +8,13 @@
OS-independent.  */
 
 #include 
+
+/* WCONTINUED is not defined on GNU/Hurd */
+#ifdef __GNU__
+#ifndef WCONTINUED
+#define WCONTINUED 0
+#endif
+#endif
 #include 
 
 #include "runtime.h"
Index: gcc-9-9-20190208/src/libgo/mksysinfo.sh
===
--- gcc-9-9-20190208.orig/src/libgo/mksysinfo.sh
+++ gcc-9-9-20190208/src/libgo/mksysinfo.sh
@@ -486,9 +486,8 @@ grep '^type _st_timespec ' gen-sysinfo.g
 
 # Special treatment of struct stat st_dev for GNU/Hurd
 # /usr/include/i386-gnu/bits/stat.h: #define st_dev st_fsid
-fsid_to_dev=
 if grep 'define st_dev st_fsid' gen-sysinfo.go > /dev/null 2>&1; then
-  fsid_to_dev="-e 's/st_fsid/Dev/'"
+  sed -i -e 's/; st_fsid/; st_dev/' gen-sysinfo.go
 fi
 
 # The stat type.
@@ -501,7 +500,6 @@ else
 fi | sed -e 's/type _stat64/type Stat_t/' \
  -e 's/type _stat/type Stat_t/' \
  -e 's/st_dev/Dev/' \
- ${fsid_to_dev} \
  -e 's/st_ino/Ino/g' \
  -e 's/st_nlink/Nlink/' \
  -e 's/st_mode/Mode/' \
libgo/ChangeLog

2018-10-20  Svante Signell 
  * libgo/testsuite/gotest: Remove ps -o comm option for GNU/Hurd.

Index: gcc-9-9-20190208/src/libgo/testsuite/gotest
===
--- gcc-9-9-20190208.orig/src/libgo/testsuite/gotest
+++ gcc-9-9-20190208/src/libgo/testsuite/gotest
@@ -650,7 +650,11 @@ xno)
 		wait $pid
 		status=$?
 		if ! test -f gotest-timeout; then
-		sleeppid=`ps -o pid,ppid,comm | grep " $alarmpid " | grep sleep | sed -e 's/ *\([0-9]*\) .*$/\1/'`
+		if test "$goos" = "hurd"; then
+			sleeppid=`ps -o pid,ppid | grep " $alarmpid " | grep sleep | sed -e 's/ *\([0-9]*\) .*$/\1/'`
+		else
+			sleeppid=`ps -o pid,ppid,comm | grep " $alarmpid " | grep sleep | sed -e 's/ *\([0-9]*\) .*$/\1/'`
+		fi
 		kill $alarmpid
 		wait $alarmpid
 		if test "$sleeppid" != ""; then


Re: [PATCH 18/43] i386: Emulate MMX V4HI smaxmin/V8QI umaxmin with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX V4HI smaxmin/V8QI umaxmin with SSE.  Only SSE register source
> operand is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_v4hi3): Add SSE emulation
>   support.
>   (mmx_v8qi3): Likewise.
>   (smaxmin:v4hi3): New.
>   (umaxmin:v8qi3): Likewise.
>   (smaxmin:*mmx_v4hi3): Add SSE emulation.
>   (umaxmin:*mmx_v8qi3): Likewise.

Please change insn conditions, as in previous patch.

Uros.

> ---
>  gcc/config/i386/mmx.md | 60 +++---
>  1 file changed, 44 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 5265024c529..3390e42ea5b 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -957,38 +957,66 @@
>  (smaxmin:V4HI
> (match_operand:V4HI 1 "nonimmediate_operand")
> (match_operand:V4HI 2 "nonimmediate_operand")))]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> +  "((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +   || TARGET_3DNOW_A"
> +  "ix86_fixup_binary_operands_no_copy (, V4HImode, operands);")
> +
> +(define_expand "v4hi3"
> +  [(set (match_operand:V4HI 0 "register_operand")
> +(smaxmin:V4HI
> +   (match_operand:V4HI 1 "nonimmediate_operand")
> +   (match_operand:V4HI 2 "nonimmediate_operand")))]
> +  "TARGET_MMX_WITH_SSE"
>"ix86_fixup_binary_operands_no_copy (, V4HImode, operands);")
>
>  (define_insn "*mmx_v4hi3"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
>  (smaxmin:V4HI
> -   (match_operand:V4HI 1 "nonimmediate_operand" "%0")
> -   (match_operand:V4HI 2 "nonimmediate_operand" "ym")))]
> -  "(TARGET_SSE || TARGET_3DNOW_A)
> +   (match_operand:V4HI 1 "nonimmediate_operand" "%0,0,Yv")
> +   (match_operand:V4HI 2 "nonimmediate_operand" "ym,x,Yv")))]
> +  "(((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +|| TARGET_3DNOW_A)
> && ix86_binary_operator_ok (, V4HImode, operands)"
> -  "pw\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxadd")
> -   (set_attr "mode" "DI")])
> +  "@
> +   pw\t{%2, %0|%0, %2}
> +   pw\t{%2, %0|%0, %2}
> +   vpw\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxadd,sseiadd,sseiadd")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_v8qi3"
>[(set (match_operand:V8QI 0 "register_operand")
>  (umaxmin:V8QI
> (match_operand:V8QI 1 "nonimmediate_operand")
> (match_operand:V8QI 2 "nonimmediate_operand")))]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> +  "((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +   || TARGET_3DNOW_A"
> +  "ix86_fixup_binary_operands_no_copy (, V8QImode, operands);")
> +
> +(define_expand "v8qi3"
> +  [(set (match_operand:V8QI 0 "register_operand")
> +(umaxmin:V8QI
> +   (match_operand:V8QI 1 "nonimmediate_operand")
> +   (match_operand:V8QI 2 "nonimmediate_operand")))]
> +  "TARGET_MMX_WITH_SSE"
>"ix86_fixup_binary_operands_no_copy (, V8QImode, operands);")
>
>  (define_insn "*mmx_v8qi3"
> -  [(set (match_operand:V8QI 0 "register_operand" "=y")
> +  [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv")
>  (umaxmin:V8QI
> -   (match_operand:V8QI 1 "nonimmediate_operand" "%0")
> -   (match_operand:V8QI 2 "nonimmediate_operand" "ym")))]
> -  "(TARGET_SSE || TARGET_3DNOW_A)
> +   (match_operand:V8QI 1 "nonimmediate_operand" "%0,0,Yv")
> +   (match_operand:V8QI 2 "nonimmediate_operand" "ym,x,Yv")))]
> +  "(((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +|| TARGET_3DNOW_A)
> && ix86_binary_operator_ok (, V8QImode, operands)"
> -  "pb\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxadd")
> -   (set_attr "mode" "DI")])
> +  "@
> +   pb\t{%2, %0|%0, %2}
> +   pb\t{%2, %0|%0, %2}
> +   vpb\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxadd,sseiadd,sseiadd")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "mmx_ashr3"
>[(set (match_operand:MMXMODE24 0 "register_operand" "=y")
> --
> 2.20.1
>
>


Re: [PATCH 16/43] i386: Emulate MMX mmx_pextrw with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX mmx_pextrw with SSE.  Only SSE register source operand is
> allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_pextrw): Add SSE emulation.
> ---
>  gcc/config/i386/mmx.md | 16 +---
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index dc81d7f45df..57669018d0c 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1349,16 +1349,18 @@
> (set_attr "mode" "DI")])
>
>  (define_insn "mmx_pextrw"
> -  [(set (match_operand:SI 0 "register_operand" "=r")
> +  [(set (match_operand:SI 0 "register_operand" "=r,r")
>  (zero_extend:SI
> (vec_select:HI
> - (match_operand:V4HI 1 "register_operand" "y")
> - (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n")]]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> -  "pextrw\t{%2, %1, %0|%0, %1, %2}"
> -  [(set_attr "type" "mmxcvt")
> + (match_operand:V4HI 1 "register_operand" "y,Yv")
> + (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n,n")]]
> +  "((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +   || TARGET_3DNOW_A"

(TARGET_MMX  || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)

Uros.

> +  "%vpextrw\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64")
> +   (set_attr "type" "mmxcvt,sselog1")
> (set_attr "length_immediate" "1")
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI")])
>
>  (define_expand "mmx_pshufw"
>[(match_operand:V4HI 0 "register_operand")
> --
> 2.20.1
>
>


Re: [PATCH 13/43] i386: Emulate MMX pshufw with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX pshufw with SSE.  Only SSE register source operand is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_pshufw_1): Add SSE emulation.
>   (*vec_dupv4hi): Likewise.
>   emulation.
> ---
>  gcc/config/i386/mmx.md | 33 +
>  1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 1ee51c5deb7..dc81d7f45df 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1364,7 +1364,8 @@
>[(match_operand:V4HI 0 "register_operand")
> (match_operand:V4HI 1 "nonimmediate_operand")
> (match_operand:SI 2 "const_int_operand")]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> +  "((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +   || TARGET_3DNOW_A"

I think that the above condition should read

(TARGET_MMX || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)

and with TARGET_MMX_WITH_SSE (which implies SSE2) we always use XMM
registers. Without SSE2, we use MMX registers, as before.
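
As a rough illustration of how the two forms of the condition differ, the sketch below treats the four target macros as independent booleans and compares both forms. That independence is an assumption: in the compiler, option dependencies may rule out some combinations. The function names are hypothetical.

```c
#include <stdbool.h>

/* Condition as posted in the patch.  */
bool cond_as_posted(bool mmx, bool mmx_with_sse, bool sse, bool a3dnow)
{
  return ((mmx || mmx_with_sse) && sse) || a3dnow;
}

/* Condition suggested in the review.  */
bool cond_as_suggested(bool mmx, bool mmx_with_sse, bool sse, bool a3dnow)
{
  return (mmx || mmx_with_sse) && (sse || a3dnow);
}

/* Count the flag combinations on which the two forms disagree.  */
int count_disagreements(void)
{
  int n = 0;
  for (int bits = 0; bits < 16; bits++)
    {
      bool mmx = bits & 1;
      bool mws = bits & 2;
      bool sse = bits & 4;
      bool a3d = bits & 8;
      if (cond_as_posted(mmx, mws, sse, a3d)
          != cond_as_suggested(mmx, mws, sse, a3d))
        n++;
    }
  return n;
}
```

Under that assumption the two forms disagree only when the 3DNOW_A flag is set while neither MMX flag is: the posted form accepts that case, the suggested form rejects it.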

>  {
>int mask = INTVAL (operands[2]);
>emit_insn (gen_mmx_pshufw_1 (operands[0], operands[1],
> @@ -1376,14 +1377,15 @@
>  })
>
>  (define_insn "mmx_pshufw_1"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,Yv")
>  (vec_select:V4HI
> -  (match_operand:V4HI 1 "nonimmediate_operand" "ym")
> +  (match_operand:V4HI 1 "nonimmediate_operand" "ym,Yv")
>(parallel [(match_operand 2 "const_0_to_3_operand")
>   (match_operand 3 "const_0_to_3_operand")
>   (match_operand 4 "const_0_to_3_operand")
>   (match_operand 5 "const_0_to_3_operand")])))]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> +  "((TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE)
> +   || TARGET_3DNOW_A"
>  {
>int mask = 0;
>mask |= INTVAL (operands[2]) << 0;
> @@ -1392,11 +1394,15 @@
>mask |= INTVAL (operands[5]) << 6;
>operands[2] = GEN_INT (mask);
>
> -  return "pshufw\t{%2, %1, %0|%0, %1, %2}";
> +  if (TARGET_MMX_WITH_SSE)
> +return "%vpshuflw\t{%2, %1, %0|%0, %1, %2}";
> +  else
> +return "pshufw\t{%2, %1, %0|%0, %1, %2}";

The above should be implemented as a multi-output template.
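
For reference, the immediate that the quoted output code assembles from operands 2..5 can be sketched in plain C. The hunk only shows the shifts by 0 and 6; the shifts by 2 and 4 below follow the pshufw immediate layout (two selector bits per destination lane). The helper names are hypothetical and not part of GCC.

```c
#include <assert.h>

/* Pack four 2-bit lane selectors into the 8-bit shuffle immediate. */
static unsigned pshufw_imm(unsigned s0, unsigned s1, unsigned s2, unsigned s3)
{
    assert(s0 < 4 && s1 < 4 && s2 < 4 && s3 < 4);
    return (s0 << 0) | (s1 << 2) | (s2 << 4) | (s3 << 6);
}

/* Apply the shuffle to four 16-bit lanes: dst[i] = src[selector i]. */
static void shuffle4(const unsigned short src[4], unsigned imm,
                     unsigned short dst[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = src[(imm >> (2 * i)) & 3];
}
```

With selectors (3, 2, 1, 0) the immediate is 0x1B and the four lanes come out reversed.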

>  }
> -  [(set_attr "type" "mmxcvt")
> +  [(set_attr "mmx_isa" "native,x64")
> +   (set_attr "type" "mmxcvt,sselog")
> (set_attr "length_immediate" "1")
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI")])
>
>  (define_insn "mmx_pswapdv2si2"
>[(set (match_operand:V2SI 0 "register_operand" "=y")
> @@ -1410,15 +1416,18 @@
> (set_attr "mode" "DI")])
>
>  (define_insn "*vec_dupv4hi"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,Yv")
>   (vec_duplicate:V4HI
> (truncate:HI
> - (match_operand:SI 1 "register_operand" "0"]
> + (match_operand:SI 1 "register_operand" "0,Yv"]
>"TARGET_SSE || TARGET_3DNOW_A"

Here we also need "(TARGET_MMX || TARGET_MMX_WITH_SSE) &&"

Uros.

> -  "pshufw\t{$0, %0, %0|%0, %0, 0}"
> -  [(set_attr "type" "mmxcvt")
> +  "@
> +   pshufw\t{$0, %0, %0|%0, %0, 0}
> +   %vpshuflw\t{$0, %1, %0|%0, %1, 0}"
> +  [(set_attr "mmx_isa" "native,x64")
> +   (set_attr "type" "mmxcvt,sselog1")
> (set_attr "length_immediate" "1")
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI")])
>
>  (define_insn_and_split "*vec_dupv2si"
>[(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
> --
> 2.20.1
>
>


*Ping* [patch, fortran] Fix PR 71237

2019-02-10 Thread Thomas Koenig

On 06.02.19 at 21:27, Thomas Koenig wrote:

Hello world,

this patch fixes a 7/8/9 regression where we tried to accept invalid
code, which led to an ICE later on.

The patch is rather straightforward.  The reason why I could not
use gfc_expr_attr is that it does not actually return the
flags the way they can be found in the original attributes;
for example, an expression containing a pointer attribute is
shown as having the target attribute, for reasons I cannot
fathom.

Regression-tested.  OK for trunk and other open branches?


Ping?

And please disregard the ChangeLog entry in the patch :-)

Regards

Thomas


[patch, fortran] Fix part of PR 71066

2019-02-10 Thread Thomas Koenig

Hello world,

this patch fixes the coarray part of PR 71066 - handling of data
statements for coarrays.  The PR itself is marked as a 7/8/9
regression.

Regression-tested.  OK for trunk and for backporting?

Regards

Thomas

2019-02-10  Thomas Koenig

	PR fortran/71066
	* trans-decl.c (generate_coarray_sym_init): For an array
	constructor in a DATA statement of a coarray variable, set the
	rank to 1 to avoid confusion later on.  If the constructor
	contains only one value, use that for initializing.

2019-02-10  Thomas Koenig

	PR fortran/71066
	* gfortran.dg/coarray_data_1.f90: New test.

Index: trans-decl.c
===
--- trans-decl.c	(revision 268432)
+++ trans-decl.c	(working copy)
@@ -5399,6 +5399,33 @@ generate_coarray_sym_init (gfc_symbol *sym)
   /* Handle "static" initializer.  */
   if (sym->value)
 {
+  if (sym->value->expr_type == EXPR_ARRAY)
+	{
+	  gfc_constructor *c, *cnext;
+
+	  /* Test if the array has more than one element.  */
+	  c = gfc_constructor_first (sym->value->value.constructor);
+	  gcc_assert (c);  /* Empty constructor should not happen here.  */
+	  cnext = gfc_constructor_next (c);
+
+	  if (cnext)
+	{
+	  /* An EXPR_ARRAY with a rank > 1 here has to come from a
+		 DATA statement.  Set its rank here as not to confuse
+		 the following steps.   */
+	  sym->value->rank = 1;
+	}
+	  else
+	{
+	  /* There is only a single value in the constructor, use
+		 it directly for the assignment.  */
+	  gfc_expr *new_expr;
+	  new_expr = gfc_copy_expr (c->expr);
+	  gfc_free_expr (sym->value);
+	  sym->value = new_expr;
+	}
+	}
+
   sym->attr.pointer = 1;
   tmp = gfc_trans_assignment (gfc_lval_expr_from_sym (sym), sym->value,
   true, false);
! { dg-do  run }
! { dg-options "-fcoarray=lib -lcaf_single " }
! PR 71066 - this used to ICE
program p
   real :: a(2,2)[*]
   integer :: b(2,2)[*]
   data a /4*0.0/
   data b /1234, 2345, 3456, 4567/
   if (any (a /= 0.0)) stop 1
   if (any (b /= reshape([1234, 2345, 3456, 4567],[2,2]))) stop 2
end


Re: [Patch] [arm] Fix 88714, Arm LDRD/STRD peepholes

2019-02-10 Thread Jakub Jelinek
On Sun, Feb 10, 2019 at 10:42:55AM +0100, Christophe Lyon wrote:
> > 2019-02-08  Jakub Jelinek  
> >
> > PR bootstrap/88714
> > * config/arm/ldrdstrd.md (*arm_ldrd, *arm_strd): Use q constraint
> > instead of r.
> >
> 
> Both this simple patch and the previous one fix all the ICEs I reported, thanks.
> 
> Of course, the scan-assembler failures remain to be fixed.

Thanks.  Is the patch ok for trunk then (which one)?

There is yet another variant, I guess: using the =q constraint on
operand 0 only.  valid_operands_ldrd_strd requires that the first reg
is even and the second one higher, and the only difference between q and r
(CORE_REGS vs. GENERAL_REGS) is the ip register, which has regno 12, so the
second operand cannot be ip anyway.
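
The parity argument can be checked with a few lines of C. This is a sketch under an assumption: "second one higher" is read as "exactly first + 1", as for ARM-state LDRD/STRD register pairs. pair_ok is a hypothetical stand-in, not the real valid_operands_ldrd_strd.

```c
#include <stdbool.h>

enum { IP_REGNO = 12 };  /* regno of ip, as stated in the message */

/* Hypothetical stand-in for the pairing rule: first register even,
   second register exactly one higher (assumption, see lead-in). */
static bool pair_ok(int first, int second)
{
    return first % 2 == 0 && second == first + 1;
}

/* True if some valid pair has ip as its second register. */
static bool ip_can_be_second(void)
{
    for (int first = 0; first < 16; first++)
        if (pair_ok(first, IP_REGNO))
            return true;
    return false;
}
```

Since ip has an even regno, it could only follow an odd first register, which the rule rejects; ip can still appear as the first register of a pair.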

> > --- gcc/config/arm/ldrdstrd.md.jj   2019-02-08 11:25:42.368916124 +0100
> > +++ gcc/config/arm/ldrdstrd.md  2019-02-08 12:38:33.647585108 +0100
> > @@ -157,9 +157,9 @@ (define_peephole2 ; swap the destination
> >  ;; We use gen_operands_ldrd_strd() with a modify argument as false so that the
> >  ;; operands are not changed.
> >  (define_insn "*arm_ldrd"
> > -  [(parallel [(set (match_operand:SI 0 "s_register_operand" "=r")
> > +  [(parallel [(set (match_operand:SI 0 "s_register_operand" "=q")
> >(match_operand:SI 2 "memory_operand" "m"))
> > - (set (match_operand:SI 1 "s_register_operand" "=r")
> > + (set (match_operand:SI 1 "s_register_operand" "=q")
> >(match_operand:SI 3 "memory_operand" "m"))])]
> >"TARGET_LDRD && TARGET_ARM && reload_completed
> >&& valid_operands_ldrd_strd (operands, true)"
> > @@ -178,9 +178,9 @@ (define_insn "*arm_ldrd"
> >
> >  (define_insn "*arm_strd"
> >[(parallel [(set (match_operand:SI 2 "memory_operand" "=m")
> > -  (match_operand:SI 0 "s_register_operand" "r"))
> > +  (match_operand:SI 0 "s_register_operand" "q"))
> >   (set (match_operand:SI 3 "memory_operand" "=m")
> > -  (match_operand:SI 1 "s_register_operand" "r"))])]
> > +  (match_operand:SI 1 "s_register_operand" "q"))])]
> >"TARGET_LDRD && TARGET_ARM && reload_completed
> >&& valid_operands_ldrd_strd (operands, false)"
> >{

Jakub


Re: [PATCH 15/43] i386: Emulate MMX sse_cvtpi2ps with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX sse_cvtpi2ps with SSE2 cvtdq2ps, preserving upper 64 bits of
> destination XMM register.  Only SSE register source operand is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (UNSPEC_CVTPI2PS): New.
>   (sse_cvtpi2ps): Renamed to ...
>   (*mmx_cvtpi2ps): This.  Disabled for TARGET_MMX_WITH_SSE.
>   (sse_cvtpi2ps): New.
>   (mmx_cvtpi2ps_sse): Likewise.

LGTM.

Uros.

> ---
>  gcc/config/i386/sse.md | 83 +-
>  1 file changed, 81 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index ad31ac3d9e6..4be5ab44e81 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -18,6 +18,9 @@
>  ;; .
>
>  (define_c_enum "unspec" [
> +  ;; MMX with SSE
> +  UNSPEC_CVTPI2PS
> +
>;; SSE
>UNSPEC_MOVNT
>
> @@ -4655,14 +4658,90 @@
>  ;;
>  ;
>
> -(define_insn "sse_cvtpi2ps"
> +(define_expand "sse_cvtpi2ps"
> +  [(set (match_operand:V4SF 0 "register_operand")
> + (vec_merge:V4SF
> +   (vec_duplicate:V4SF
> + (float:V2SF (match_operand:V2SI 2 "nonimmediate_operand")))
> +   (match_operand:V4SF 1 "register_operand")
> +   (const_int 3)))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE"
> +{
> +  if (TARGET_MMX_WITH_SSE)
> +{
> +  rtx op2 = force_reg (V2SImode, operands[2]);
> +  rtx op3 = gen_reg_rtx (V4SFmode);
> +  rtx op4 = gen_reg_rtx (V4SFmode);
> +  rtx insn = gen_mmx_cvtpi2ps_sse (operands[0], operands[1], op2,
> +op3, op4);
> +  emit_insn (insn);
> +  DONE;
> +}
> +})
> +
> +(define_insn_and_split "mmx_cvtpi2ps_sse"
> +  [(set (match_operand:V4SF 0 "register_operand" "=x,Yv")
> + (unspec:V4SF [(match_operand:V2SI 2 "register_operand" "x,Yv")
> +   (match_operand:V4SF 1 "register_operand" "0,Yv")]
> +  UNSPEC_CVTPI2PS))
> +   (set (match_operand:V4SF 3 "register_operand" "=x,Yv")
> + (unspec:V4SF [(match_operand:V4SF 4 "register_operand" "3,3")]
> +  UNSPEC_CVTPI2PS))]
> +  "TARGET_MMX_WITH_SSE"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +{
> +  rtx op2 = gen_rtx_REG (V4SImode, REGNO (operands[2]));
> +  /* Generate SSE2 cvtdq2ps.  */
> +  rtx insn = gen_floatv4siv4sf2 (operands[3], op2);
> +  emit_insn (insn);
> +
> +  /* Merge operands[3] with operands[0].  */
> +  rtx mask, op1;
> +  if (TARGET_AVX)
> +{
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +gen_rtvec (4, GEN_INT (0), GEN_INT (1),
> +   GEN_INT (6), GEN_INT (7)));
> +  op1 = gen_rtx_VEC_CONCAT (V8SFmode, operands[3], operands[1]);
> +  op2 = gen_rtx_VEC_SELECT (V4SFmode, op1, mask);
> +  insn = gen_rtx_SET (operands[0], op2);
> +}
> +  else
> +{
> +  /* NB: SSE can only concatenate OP0 and OP3 to OP0.  */
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +gen_rtvec (4, GEN_INT (2), GEN_INT (3),
> +   GEN_INT (4), GEN_INT (5)));
> +  op1 = gen_rtx_VEC_CONCAT (V8SFmode, operands[0], operands[3]);
> +  op2 = gen_rtx_VEC_SELECT (V4SFmode, op1, mask);
> +  insn = gen_rtx_SET (operands[0], op2);
> +  emit_insn (insn);
> +
> +  /* Swap bits 0:63 with bits 64:127.  */
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +gen_rtvec (4, GEN_INT (2), GEN_INT (3),
> +   GEN_INT (0), GEN_INT (1)));
> +  rtx dest = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> +  op1 = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
> +  insn = gen_rtx_SET (dest, op1);
> +}
> +  emit_insn (insn);
> +  DONE;
> +}
> +  [(set_attr "isa" "noavx,avx")
> +   (set_attr "type" "ssecvt")
> +   (set_attr "mode" "V4SF")])
> +
> +(define_insn "*mmx_cvtpi2ps"
>[(set (match_operand:V4SF 0 "register_operand" "=x")
>   (vec_merge:V4SF
> (vec_duplicate:V4SF
>   (float:V2SF (match_operand:V2SI 2 "nonimmediate_operand" "ym")))
> (match_operand:V4SF 1 "register_operand" "0")
> (const_int 3)))]
> -  "TARGET_SSE"
> +  "TARGET_SSE && !TARGET_MMX_WITH_SSE"
>"cvtpi2ps\t{%2, %0|%0, %2}"
>[(set_attr "type" "ssecvt")
> (set_attr "mode" "V4SF")])
> --
> 2.20.1
>
>
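
The two merge sequences in the quoted splitter (one select for AVX, a select followed by a half swap for plain SSE) can be modelled on float arrays to check that they produce the same lanes. This is only a sketch of the lane bookkeeping; the function names are made up, and "conv" stands for the cvtdq2ps result held in operands[3], "dest" for the tied destination/operand 1.

```c
#include <assert.h>
#include <string.h>

/* Select four lanes out of the 8-lane concatenation of lo and hi. */
static void concat_select(const float lo[4], const float hi[4],
                          const int idx[4], float out[4])
{
    float cat[8];
    memcpy(cat, lo, 4 * sizeof(float));
    memcpy(cat + 4, hi, 4 * sizeof(float));
    for (int i = 0; i < 4; i++) {
        assert(idx[i] >= 0 && idx[i] < 8);
        out[i] = cat[idx[i]];
    }
}

/* AVX path: one select {0,1,6,7} from concat(conv, dest). */
static void merge_avx(const float conv[4], const float dest[4], float out[4])
{
    static const int idx[4] = {0, 1, 6, 7};
    concat_select(conv, dest, idx, out);
}

/* SSE path: select {2,3,4,5} from concat(dest, conv), then swap halves. */
static void merge_sse(const float conv[4], const float dest[4], float out[4])
{
    static const int idx[4] = {2, 3, 4, 5};
    static const int swap[4] = {2, 3, 0, 1};
    float tmp[4];
    concat_select(dest, conv, idx, tmp);
    for (int i = 0; i < 4; i++)
        out[i] = tmp[swap[i]];
}
```

Both paths yield the two converted values in the low lanes and the original upper lanes of the destination, which is the "preserving upper 64 bits" behaviour the description asks for.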


Re: [PATCH 14/43] i386: Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE.
>
>   PR target/89021
>   * config/i386/mmx.md (sse_cvtps2pi): Add SSE emulation.
>   (sse_cvttps2pi): Likewise.

It looks to me that this description is wrong. We don't have V4SF
modes here, but V2SF, so we have to fake a 64-bit load in the case of MMX.
The cvtps2dq will access memory in true 128-bit width, so this is
wrong.

We have to fix the description to not fake wide mode.

Uros.

> ---
>  gcc/config/i386/sse.md | 30 ++
>  1 file changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 7d2c0367911..ad31ac3d9e6 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -4668,26 +4668,32 @@
> (set_attr "mode" "V4SF")])
>
>  (define_insn "sse_cvtps2pi"
> -  [(set (match_operand:V2SI 0 "register_operand" "=y")
> +  [(set (match_operand:V2SI 0 "register_operand" "=y,Yv")
>   (vec_select:V2SI
> -   (unspec:V4SI [(match_operand:V4SF 1 "nonimmediate_operand" "xm")]
> +   (unspec:V4SI [(match_operand:V4SF 1 "nonimmediate_operand" "xm,YvBm")]
>  UNSPEC_FIX_NOTRUNC)
> (parallel [(const_int 0) (const_int 1)])))]
> -  "TARGET_SSE"
> -  "cvtps2pi\t{%1, %0|%0, %q1}"
> -  [(set_attr "type" "ssecvt")
> -   (set_attr "unit" "mmx")
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE"
> +  "@
> +   cvtps2pi\t{%1, %0|%0, %q1}
> +   %vcvtps2dq\t{%1, %0|%0, %1}"
> +  [(set_attr "mmx_isa" "native,x64")
> +   (set_attr "type" "ssecvt")
> +   (set_attr "unit" "mmx,*")
> (set_attr "mode" "DI")])
>
>  (define_insn "sse_cvttps2pi"
> -  [(set (match_operand:V2SI 0 "register_operand" "=y")
> +  [(set (match_operand:V2SI 0 "register_operand" "=y,Yv")
>   (vec_select:V2SI
> -   (fix:V4SI (match_operand:V4SF 1 "nonimmediate_operand" "xm"))
> +   (fix:V4SI (match_operand:V4SF 1 "nonimmediate_operand" "xm,YvBm"))
> (parallel [(const_int 0) (const_int 1)])))]
> -  "TARGET_SSE"
> -  "cvttps2pi\t{%1, %0|%0, %q1}"
> -  [(set_attr "type" "ssecvt")
> -   (set_attr "unit" "mmx")
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE"
> +  "@
> +   cvttps2pi\t{%1, %0|%0, %q1}
> +   %vcvttps2dq\t{%1, %0|%0, %1}"
> +  [(set_attr "mmx_isa" "native,x64")
> +   (set_attr "type" "ssecvt")
> +   (set_attr "unit" "mmx,*")
> (set_attr "prefix_rep" "0")
> (set_attr "mode" "SF")])
>
> --
> 2.20.1
>
>


Re: Fix odr ICE on Ada LTO

2019-02-10 Thread Jan Hubicka
> 
> This caused:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89272

My apologies for that. Fixed by the attached patch. This is about an ICE
with -fno-lto-odr-type-merging, which is an option I think we should drop
(probably next stage1, but if it turns out to cause trouble, I would not be
against dropping it from GCC 9).

It is not useful and only leads to code duplication.

lto-bootstrapped/regtested x86_64-linux, committed.

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 267609)
+++ ChangeLog   (working copy)
@@ -1,4 +1,14 @@
+2019-01-03  Jan Hubicka  
+   Backport from mainline
+   2018-08-29  Jan Hubicka  
+
+   PR lto/86517
+   PR lto/88185
+   * lto-opts.c (lto_write_options): Always stream PIC/PIE mode.
+   * lto-wrapper.c (merge_and_complain): Fix merging of PIC/PIE.
+
 2019-01-04  Aaron Sawdey  
+
Backport from mainline
2018-11-28  Aaron Sawdey  
 
Index: lto-opts.c
===
--- lto-opts.c  (revision 267609)
+++ lto-opts.c  (working copy)
@@ -78,6 +78,21 @@ lto_write_options (void)
   && !global_options.x_flag_openacc)
 append_to_collect_gcc_options (_obstack, _p,
   "-fno-openacc");
+  /* Append PIC/PIE mode because its default depends on target and it is
+ subject of merging in lto-wrapper.  */
+  if (!global_options_set.x_flag_pic && !global_options_set.x_flag_pie)
+{
+   append_to_collect_gcc_options (_obstack, _p,
+ global_options.x_flag_pic == 2
+ ? "-fPIC"
+ : global_options.x_flag_pic == 1
+ ? "-fpic"
+ : global_options.x_flag_pie == 2
+ ? "-fPIE"
+ : global_options.x_flag_pie == 1
+ ? "-fpie"
+ : "-fno-pie");
+}
 
   /* Append options from target hook and store them to offload_lto section.  */
   if (lto_stream_offload_p)
Index: lto-wrapper.c
===
--- lto-wrapper.c   (revision 267609)
+++ lto-wrapper.c   (working copy)
@@ -408,6 +408,11 @@ merge_and_complain (struct cl_decoded_op
  It is a common mistake to mix few -fPIC compiled objects into otherwise
  non-PIC code.  We do not want to build everything with PIC then.
 
+ Similarly we merge PIE options, however in addition we keep
+  -fPIC + -fPIE = -fPIE
+  -fpic + -fPIE = -fpie
+  -fPIC/-fpic + -fpie = -fpie
+
  It would be good to warn on mismatches, but it is bit hard to do as
  we do not know what nothing translates to.  */
 
@@ -415,11 +420,38 @@ merge_and_complain (struct cl_decoded_op
 if ((*decoded_options)[j].opt_index == OPT_fPIC
 || (*decoded_options)[j].opt_index == OPT_fpic)
   {
-   if (!pic_option
-   || (pic_option->value > 0) != ((*decoded_options)[j].value > 0))
- remove_option (decoded_options, j, decoded_options_count);
-   else if (pic_option->opt_index == OPT_fPIC
-&& (*decoded_options)[j].opt_index == OPT_fpic)
+   /* -fno-pic in one unit implies -fno-pic everywhere.  */
+   if ((*decoded_options)[j].value == 0)
+ j++;
+   /* If we have no pic option or merge in -fno-pic, we still may turn
+  existing pic/PIC mode into pie/PIE if -fpie/-fPIE is present.  */
+   else if ((pic_option && pic_option->value == 0)
+|| !pic_option)
+ {
+   if (pie_option)
+ {
+   bool big = (*decoded_options)[j].opt_index == OPT_fPIC
+  && pie_option->opt_index == OPT_fPIE;
+   (*decoded_options)[j].opt_index = big ? OPT_fPIE : OPT_fpie;
+   if (pie_option->value)
+ (*decoded_options)[j].canonical_option[0] = big ? "-fPIE" : "-fpie";
+   else
+ (*decoded_options)[j].canonical_option[0] = big ? "-fno-pie" : "-fno-pie";
+   (*decoded_options)[j].value = pie_option->value;
+   j++;
+ }
+   else if (pic_option)
+ {
+   (*decoded_options)[j] = *pic_option;
+   j++;
+ }
+   /* We do not know if target defaults to pic or not, so just remove
+  option if it is missing in one unit but enabled in other.  */
+   else
+ remove_option (decoded_options, j, decoded_options_count);
+ }
+   else if (pic_option->opt_index == OPT_fpic
+&& (*decoded_options)[j].opt_index == OPT_fPIC)
  {
(*decoded_options)[j] = *pic_option;
j++;
@@ -430,11 +462,42 @@ merge_and_complain (struct cl_decoded_op
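
The three merge rules listed in the new comment (-fPIC + -fPIE = -fPIE, -fpic + -fPIE = -fpie, -fPIC/-fpic + -fpie = -fpie) amount to taking the smaller of the two levels and keeping the PIE flavour. The sketch below encodes just that reading; the level encoding and both helper names are assumptions for illustration and do not mirror the data structures in lto-wrapper.c.

```c
#include <assert.h>

/* Levels: 1 = small (-fpic / -fpie), 2 = big (-fPIC / -fPIE). */

/* Level of the merged PIE option when one unit asked for PIC at
   pic_level and another for PIE at pie_level (both enabled). */
static int merged_pie_level(int pic_level, int pie_level)
{
    assert(pic_level >= 1 && pic_level <= 2);
    assert(pie_level >= 1 && pie_level <= 2);
    return pic_level < pie_level ? pic_level : pie_level;
}

/* Spelling of the PIE option for a given level. */
static const char *pie_flag(int level)
{
    return level == 2 ? "-fPIE" : "-fpie";
}
```

Only the -fPIC with -fPIE combination keeps the big model; every other mix falls back to -fpie.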

Re: [PATCH 12/43] i386: Emulate MMX vec_dupv2si with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX vec_dupv2si with SSE.  Only SSE register source operand is
> allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (*vec_dupv2si): Changed to
>   define_insn_and_split and also allow TARGET_MMX_WITH_SSE to
>   support SSE emulation.
>   * config/i386/sse.md (*vec_dupv4si): Renamed to ...
>   (vec_dupv4si): This.
> ---
>  gcc/config/i386/mmx.md | 27 ---
>  gcc/config/i386/sse.md |  2 +-
>  2 files changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index d360e97c98b..1ee51c5deb7 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1420,14 +1420,27 @@
> (set_attr "length_immediate" "1")
> (set_attr "mode" "DI")])
>
> -(define_insn "*vec_dupv2si"
> -  [(set (match_operand:V2SI 0 "register_operand" "=y")
> +(define_insn_and_split "*vec_dupv2si"
> +  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
>   (vec_duplicate:V2SI
> -   (match_operand:SI 1 "register_operand" "0")))]
> -  "TARGET_MMX"
> -  "punpckldq\t%0, %0"
> -  [(set_attr "type" "mmxcvt")
> -   (set_attr "mode" "DI")])
> +   (match_operand:SI 1 "register_operand" "0,0,Yv")))]
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
> +  "@
> +   punpckldq\t%0, %0
> +   #
> +   #"
> +  "&& reload_completed && TARGET_MMX_WITH_SSE"

Please fix the above.

> +  [(const_int 0)]
> +{
> +  /* Emulate MMX vec_dupv2si with SSE vec_dupv4si.  */
> +  rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> +  rtx insn = gen_vec_dupv4si (op0, operands[1]);
> +  emit_insn (insn);
> +  DONE;

Please write this simple RTX explicitly in the place of (const_int 0) above.

Uros.

> +}
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxcvt,ssemov,ssemov")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "*mmx_concatv2si"
>[(set (match_operand:V2SI 0 "register_operand" "=y,y")
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 5dc0930ac1f..7d2c0367911 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -18976,7 +18976,7 @@
> (set_attr "prefix" "maybe_evex,maybe_evex,orig")
> (set_attr "mode" "V4SF")])
>
> -(define_insn "*vec_dupv4si"
> +(define_insn "vec_dupv4si"
>[(set (match_operand:V4SI 0 "register_operand" "=v,v,x")
>   (vec_duplicate:V4SI
> (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0")))]
> --
> 2.20.1
>
>


Re: [PATCH 11/43] i386: Emulate MMX mmx_eq/mmx_gt3 with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX mmx_eq/mmx_gt3 with SSE.  Only SSE register source
> operand is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_eq3): Also allow
>   TARGET_MMX_WITH_SSE.
>   (*mmx_eq3): Also allow TARGET_MMX_WITH_SSE.  Add SSE
>   support.
>   (mmx_gt3): Likewise.

OK.

Uros.

> ---
>  gcc/config/i386/mmx.md | 39 ---
>  1 file changed, 24 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 8945ece2a03..d360e97c98b 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1063,28 +1063,37 @@
>  (eq:MMXMODEI
> (match_operand:MMXMODEI 1 "nonimmediate_operand")
> (match_operand:MMXMODEI 2 "nonimmediate_operand")))]
> -  "TARGET_MMX"
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
>"ix86_fixup_binary_operands_no_copy (EQ, mode, operands);")
>
>  (define_insn "*mmx_eq3"
> -  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
> +  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv")
>  (eq:MMXMODEI
> -   (match_operand:MMXMODEI 1 "nonimmediate_operand" "%0")
> -   (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))]
> -  "TARGET_MMX && ix86_binary_operator_ok (EQ, mode, operands)"
> -  "pcmpeq\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxcmp")
> -   (set_attr "mode" "DI")])
> +   (match_operand:MMXMODEI 1 "nonimmediate_operand" "%0,0,Yv")
> +   (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym,x,Yv")))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && ix86_binary_operator_ok (EQ, mode, operands)"
> +  "@
> +   pcmpeq\t{%2, %0|%0, %2}
> +   pcmpeq\t{%2, %0|%0, %2}
> +   vpcmpeq\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxcmp,ssecmp,ssecmp")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "mmx_gt3"
> -  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
> +  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv")
>  (gt:MMXMODEI
> -   (match_operand:MMXMODEI 1 "register_operand" "0")
> -   (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))]
> -  "TARGET_MMX"
> -  "pcmpgt\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxcmp")
> -   (set_attr "mode" "DI")])
> +   (match_operand:MMXMODEI 1 "register_operand" "0,0,Yv")
> +   (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym,x,Yv")))]
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
> +  "@
> +   pcmpgt\t{%2, %0|%0, %2}
> +   pcmpgt\t{%2, %0|%0, %2}
> +   vpcmpgt\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxcmp,ssecmp,ssecmp")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  ;
>  ;;
> --
> 2.20.1
>
>


Re: [PATCH 09/43] i386: Emulate MMX 3 with SSE

2019-02-10 Thread Uros Bizjak
On 2/9/19, H.J. Lu  wrote:
> Emulate MMX 3 with SSE.  Only SSE register source
> operand is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (any_logic:3): New.
>   (any_logic:*mmx_3): Also allow TARGET_MMX_WITH_SSE.
>   Add SSE support.

OK.

Uros.

> ---
>  gcc/config/i386/mmx.md | 27 ---
>  1 file changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 69c66e968b5..fae2e43af24 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1110,15 +1110,28 @@
>"TARGET_MMX"
>"ix86_fixup_binary_operands_no_copy (, mode, operands);")
>
> +(define_expand "3"
> +  [(set (match_operand:MMXMODEI 0 "register_operand")
> + (any_logic:MMXMODEI
> +   (match_operand:MMXMODEI 1 "nonimmediate_operand")
> +   (match_operand:MMXMODEI 2 "nonimmediate_operand")))]
> +  "TARGET_MMX_WITH_SSE"
> +  "ix86_fixup_binary_operands_no_copy (, mode, operands);")
> +
>  (define_insn "*mmx_3"
> -  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
> +  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,Yx,Yy")
>  (any_logic:MMXMODEI
> -   (match_operand:MMXMODEI 1 "nonimmediate_operand" "%0")
> -   (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))]
> -  "TARGET_MMX && ix86_binary_operator_ok (, mode, operands)"
> -  "p\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxadd")
> -   (set_attr "mode" "DI")])
> +   (match_operand:MMXMODEI 1 "nonimmediate_operand" "%0,0,Yy")
> +   (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym,Yx,Yy")))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && ix86_binary_operator_ok (, mode, operands)"
> +  "@
> +   p\t{%2, %0|%0, %2}
> +   p\t{%2, %0|%0, %2}
> +   vp\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxadd,sselog,sselog")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  ;
>  ;;
> --
> 2.20.1
>
>


Re: [PATCH 10/43] i386: Emulate MMX mmx_andnot3 with SSE

2019-02-10 Thread Uros Bizjak
On 2/9/19, H.J. Lu  wrote:
> Emulate MMX mmx_andnot3 with SSE.  Only SSE register source operand
> is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_andnot3): Also allow
>   TARGET_MMX_WITH_SSE.  Add SSE support.

OK.

Uros.

> ---
>  gcc/config/i386/mmx.md | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index fae2e43af24..1e235bfcde4 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1093,14 +1093,18 @@
>  ;
>
>  (define_insn "mmx_andnot3"
> -  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
> +  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,Yx,Yy")
>   (and:MMXMODEI
> -   (not:MMXMODEI (match_operand:MMXMODEI 1 "register_operand" "0"))
> -   (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))]
> -  "TARGET_MMX"
> -  "pandn\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxadd")
> -   (set_attr "mode" "DI")])
> +   (not:MMXMODEI (match_operand:MMXMODEI 1 "register_operand" "0,0,Yy"))
> +   (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym,Yx,Yy")))]
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
> +  "@
> +   pandn\t{%2, %0|%0, %2}
> +   pandn\t{%2, %0|%0, %2}
> +   vpandn\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxadd,sselog,sselog")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_3"
>[(set (match_operand:MMXMODEI 0 "register_operand")
> --
> 2.20.1
>
>


Re: [PATCH 08/43] i386: Emulate MMX ashr3/3 with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX ashr3/3 with SSE.  Only SSE register
> source operand is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_ashr3): Disallow with
>   TARGET_MMX_WITH_SSE.
>   (mmx_3): Likewise.
>   (ashr3): New.
>   (3): Likewise.

Please merge the patterns using the mmx_isa attribute.

Uros.

> ---
>  gcc/config/i386/mmx.md | 38 --
>  1 file changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 2024c75fa78..9e07bf31f81 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -995,7 +995,7 @@
>  (ashiftrt:MMXMODE24
> (match_operand:MMXMODE24 1 "register_operand" "0")
> (match_operand:DI 2 "nonmemory_operand" "yN")))]
> -  "TARGET_MMX"
> +  "TARGET_MMX && !TARGET_MMX_WITH_SSE"
>"psra\t{%2, %0|%0, %2}"
>[(set_attr "type" "mmxshft")
> (set (attr "length_immediate")
> @@ -1009,7 +1009,7 @@
>  (any_lshift:MMXMODE248
> (match_operand:MMXMODE248 1 "register_operand" "0")
> (match_operand:DI 2 "nonmemory_operand" "yN")))]
> -  "TARGET_MMX"
> +  "TARGET_MMX && !TARGET_MMX_WITH_SSE"
>"p\t{%2, %0|%0, %2}"
>[(set_attr "type" "mmxshft")
> (set (attr "length_immediate")
> @@ -1018,6 +1018,40 @@
> (const_string "0")))
> (set_attr "mode" "DI")])
>
> +(define_insn "ashr3"
> +  [(set (match_operand:MMXMODE24 0 "register_operand" "=x,Yv")
> +(ashiftrt:MMXMODE24
> +   (match_operand:MMXMODE24 1 "register_operand" "0,Yv")
> +   (match_operand:DI 2 "nonmemory_operand" "xN,YvN")))]
> +  "TARGET_MMX_WITH_SSE"
> +  "@
> +   psra\t{%2, %0|%0, %2}
> +   vpsra\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "isa" "noavx,avx")
> +   (set_attr "type" "sseishft,sseishft")
> +   (set (attr "length_immediate")
> + (if_then_else (match_operand 2 "const_int_operand")
> +   (const_string "1")
> +   (const_string "0")))
> +   (set_attr "mode" "TI")])
> +
> +(define_insn "3"
> +  [(set (match_operand:MMXMODE248 0 "register_operand" "=x,Yv")
> +(any_lshift:MMXMODE248
> +   (match_operand:MMXMODE248 1 "register_operand" "0,Yv")
> +   (match_operand:DI 2 "nonmemory_operand" "xN,YvN")))]
> +  "TARGET_MMX_WITH_SSE"
> +  "@
> +   p\t{%2, %0|%0, %2}
> +   vp\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "isa" "noavx,avx")
> +   (set_attr "type" "sseishft,sseishft")
> +   (set (attr "length_immediate")
> + (if_then_else (match_operand 2 "const_int_operand")
> +   (const_string "1")
> +   (const_string "0")))
> +   (set_attr "mode" "TI")])
> +
>  ;
>  ;;
>  ;; Parallel integral comparisons
> --
> 2.20.1
>
>


Re: [PATCH 07/43] i386: Emulate MMX mmx_pmaddwd with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX pmaddwd with SSE.  Only SSE register source operand is
> allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_pmaddwd): Also allow TARGET_MMX_WITH_SSE.
>   (*mmx_pmaddwd): Also allow TARGET_MMX_WITH_SSE.  Add SSE support.

OK.

Uros.

> ---
>  gcc/config/i386/mmx.md | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 82ca8719492..2024c75fa78 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -855,20 +855,20 @@
>   (sign_extend:V2SI
> (vec_select:V2HI (match_dup 2)
>   (parallel [(const_int 1) (const_int 3)]))]
> -  "TARGET_MMX"
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
>"ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
>
>  (define_insn "*mmx_pmaddwd"
> -  [(set (match_operand:V2SI 0 "register_operand" "=y")
> +  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
>  (plus:V2SI
> (mult:V2SI
>   (sign_extend:V2SI
> (vec_select:V2HI
> - (match_operand:V4HI 1 "nonimmediate_operand" "%0")
> + (match_operand:V4HI 1 "nonimmediate_operand" "%0,0,Yv")
>   (parallel [(const_int 0) (const_int 2)])))
>   (sign_extend:V2SI
> (vec_select:V2HI
> - (match_operand:V4HI 2 "nonimmediate_operand" "ym")
> + (match_operand:V4HI 2 "nonimmediate_operand" "ym,x,Yv")
>   (parallel [(const_int 0) (const_int 2)]
> (mult:V2SI
>   (sign_extend:V2SI
> @@ -877,10 +877,15 @@
>   (sign_extend:V2SI
> (vec_select:V2HI (match_dup 2)
>   (parallel [(const_int 1) (const_int 3)]))]
> -  "TARGET_MMX && ix86_binary_operator_ok (MULT, V4HImode, operands)"
> -  "pmaddwd\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxmul")
> -   (set_attr "mode" "DI")])
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && ix86_binary_operator_ok (MULT, V4HImode, operands)"
> +  "@
> +   pmaddwd\t{%2, %0|%0, %2}
> +   pmaddwd\t{%2, %0|%0, %2}
> +   vpmaddwd\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxmul,sseiadd,sseiadd")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_pmulhrwv4hi3"
>[(set (match_operand:V4HI 0 "register_operand")
> --
> 2.20.1
>
>
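
As a reference for what the RTL in *mmx_pmaddwd computes, here is a scalar model in C. It is a sketch, not GCC code: the function name is made up, and the results are returned as 32-bit two's-complement bit patterns so that the single wrapping case stays well defined in C.

```c
#include <stdint.h>

/* Scalar model of pmaddwd on four 16-bit lanes: multiply lane pairs as
   signed values and add adjacent products into two 32-bit results. */
static void pmaddwd_v4hi(const int16_t a[4], const int16_t b[4],
                         uint32_t out[2])
{
    for (int i = 0; i < 2; i++) {
        int64_t sum = (int64_t)a[2 * i] * b[2 * i]
                    + (int64_t)a[2 * i + 1] * b[2 * i + 1];
        out[i] = (uint32_t)sum;  /* keep the low 32 bits */
    }
}
```

The (0, 2) and (1, 3) lane selections in the pattern pair up exactly this way: result lane 0 uses input lanes 0 and 1, result lane 1 uses input lanes 2 and 3.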


Re: [PATCH 06/43] i386: Emulate MMX smulv4hi3_highpart with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX smulv4hi3_highpart with SSE.  Only SSE register source operand is
> allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mmx_smulv4hi3_highpart): Also allow
>   TARGET_MMX_WITH_SSE.
>   (*mmx_smulv4hi3_highpart): Also allow TARGET_MMX_WITH_SSE. Add
>   SSE support.

OK.

Uros.

> ---
>  gcc/config/i386/mmx.md | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index e3b3ab09012..82ca8719492 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -784,23 +784,28 @@
> (sign_extend:V4SI
>   (match_operand:V4HI 2 "nonimmediate_operand")))
>   (const_int 16]
> -  "TARGET_MMX"
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
>"ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
>
>  (define_insn "*mmx_smulv4hi3_highpart"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
>   (truncate:V4HI
> (lshiftrt:V4SI
>   (mult:V4SI
> (sign_extend:V4SI
> - (match_operand:V4HI 1 "nonimmediate_operand" "%0"))
> + (match_operand:V4HI 1 "nonimmediate_operand" "%0,0,Yv"))
> (sign_extend:V4SI
> - (match_operand:V4HI 2 "nonimmediate_operand" "ym")))
> + (match_operand:V4HI 2 "nonimmediate_operand" "ym,x,Yv")))
>   (const_int 16]
> -  "TARGET_MMX && ix86_binary_operator_ok (MULT, V4HImode, operands)"
> -  "pmulhw\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxmul")
> -   (set_attr "mode" "DI")])
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && ix86_binary_operator_ok (MULT, V4HImode, operands)"
> +  "@
> +   pmulhw\t{%2, %0|%0, %2}
> +   pmulhw\t{%2, %0|%0, %2}
> +   vpmulhw\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxmul,ssemul,ssemul")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_umulv4hi3_highpart"
>[(set (match_operand:V4HI 0 "register_operand")
> --
> 2.20.1
>
>


Re: [PATCH 05/43] i386: Emulate MMX mulv4hi3 with SSE

2019-02-10 Thread Uros Bizjak
On 2/9/19, H.J. Lu  wrote:
> Emulate MMX mulv4hi3 with SSE.  Only SSE register source operand is
> allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (mulv4hi3): New.
>   (*mmx_mulv4hi3): Also allow TARGET_MMX_WITH_SSE.  Add SSE
>   support.

OK.

Uros.

> ---
>  gcc/config/i386/mmx.md | 26 +++---
>  1 file changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 01a71aa128b..2712a86ea3c 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -753,14 +753,26 @@
>"TARGET_MMX"
>"ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
>
> +(define_expand "mulv4hi3"
> +  [(set (match_operand:V4HI 0 "register_operand")
> +(mult:V4HI (match_operand:V4HI 1 "nonimmediate_operand")
> +(match_operand:V4HI 2 "nonimmediate_operand")))]
> +  "TARGET_MMX_WITH_SSE"
> +  "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
> +
>  (define_insn "*mmx_mulv4hi3"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> -(mult:V4HI (match_operand:V4HI 1 "nonimmediate_operand" "%0")
> -(match_operand:V4HI 2 "nonimmediate_operand" "ym")))]
> -  "TARGET_MMX && ix86_binary_operator_ok (MULT, V4HImode, operands)"
> -  "pmullw\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxmul")
> -   (set_attr "mode" "DI")])
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,Yx,Yy")
> +(mult:V4HI (match_operand:V4HI 1 "nonimmediate_operand" "%0,0,Yy")
> +(match_operand:V4HI 2 "nonimmediate_operand" "ym,Yx,Yy")))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && ix86_binary_operator_ok (MULT, V4HImode, operands)"
> +  "@
> +   pmullw\t{%2, %0|%0, %2}
> +   pmullw\t{%2, %0|%0, %2}
> +   vpmullw\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxmul,ssemul,ssemul")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_smulv4hi3_highpart"
>[(set (match_operand:V4HI 0 "register_operand")
> --
> 2.20.1
>
>


Re: [PATCH 04/43] i386: Emulate MMX plusminus/sat_plusminus with SSE

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX plusminus/sat_plusminus with SSE.  Only SSE register source
> operand is allowed.
>
>   PR target/89021
>   * config/i386/mmx.md (3): New.
>   (*mmx_3): Changed to define_insn_and_split
>   to support SSE emulation.
>   (*mmx_3): Likewise.
>   (mmx_3): Also allow TARGET_MMX_WITH_SSE.
> ---
>  gcc/config/i386/mmx.md | 49 +-
>  1 file changed, 34 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index ff9c5dc8507..32920343fcf 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -698,34 +698,53 @@
>"TARGET_MMX || (TARGET_SSE2 && mode == V1DImode)"
>"ix86_fixup_binary_operands_no_copy (, mode, operands);")
>
> +(define_expand "3"
> +  [(set (match_operand:MMXMODEI 0 "register_operand")
> + (plusminus:MMXMODEI
> +   (match_operand:MMXMODEI 1 "nonimmediate_operand")
> +   (match_operand:MMXMODEI 2 "nonimmediate_operand")))]
> +  "TARGET_MMX_WITH_SSE"
> +  "ix86_fixup_binary_operands_no_copy (, mode, operands);")
> +
>  (define_insn "*mmx_3"
> -  [(set (match_operand:MMXMODEI8 0 "register_operand" "=y")
> +  [(set (match_operand:MMXMODEI8 0 "register_operand" "=y,x,Yv")
>  (plusminus:MMXMODEI8
> -   (match_operand:MMXMODEI8 1 "nonimmediate_operand" "0")
> -   (match_operand:MMXMODEI8 2 "nonimmediate_operand" "ym")))]
> -  "(TARGET_MMX || (TARGET_SSE2 && mode == V1DImode))
> +   (match_operand:MMXMODEI8 1 "nonimmediate_operand" "0,0,Yv")
> +   (match_operand:MMXMODEI8 2 "nonimmediate_operand" "ym,x,Yv")))]
> +  "(TARGET_MMX
> +|| TARGET_MMX_WITH_SSE
> +|| (TARGET_SSE2 && mode == V1DImode))

Please change MMXMODEI8 iterator to:

> +(define_mode_iterator MMXMODEI8 [V8QI V4HI V2SI (V1DI "TARGET_SSE2")])

as was done in the previous version and use only

(TARGET_MMX || TARGET_MMX_WITH_SSE) && ...

here.

> && ix86_binary_operator_ok (, mode, operands)"
> -  "p\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxadd")
> -   (set_attr "mode" "DI")])
> +  "@
> +   p\t{%2, %0|%0, %2}
> +   p\t{%2, %0|%0, %2}
> +   vp\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxadd,sseadd,sseadd")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_3"
>[(set (match_operand:MMXMODE12 0 "register_operand")
>   (sat_plusminus:MMXMODE12
> (match_operand:MMXMODE12 1 "nonimmediate_operand")
> (match_operand:MMXMODE12 2 "nonimmediate_operand")))]
> -  "TARGET_MMX"
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
>"ix86_fixup_binary_operands_no_copy (, mode, operands);")
>
>  (define_insn "*mmx_3"
> -  [(set (match_operand:MMXMODE12 0 "register_operand" "=y")
> +  [(set (match_operand:MMXMODE12 0 "register_operand" "=y,x,Yv")
>  (sat_plusminus:MMXMODE12
> -   (match_operand:MMXMODE12 1 "nonimmediate_operand" "0")
> -   (match_operand:MMXMODE12 2 "nonimmediate_operand" "ym")))]
> -  "TARGET_MMX && ix86_binary_operator_ok (, mode, operands)"
> -  "p\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxadd")
> -   (set_attr "mode" "DI")])
> +   (match_operand:MMXMODE12 1 "nonimmediate_operand" "0,0,Yv")
> +   (match_operand:MMXMODE12 2 "nonimmediate_operand" "ym,x,Yv")))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && ix86_binary_operator_ok (, mode, operands)"
> +  "@
> +   p\t{%2, %0|%0, %2}
> +   p\t{%2, %0|%0, %2}
> +   vp\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxadd,sseadd,sseadd")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_mulv4hi3"
>[(set (match_operand:V4HI 0 "register_operand")
> --
> 2.20.1
>
>


Re: [PATCH 03/43] i386: Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX.  For MMX punpckhXX,
> move bits 64:127 to bits 0:63 in SSE register.  Only SSE register source
> operand is allowed.
>
>   PR target/89021
>   * config/i386/i386-protos.h (ix86_split_mmx_punpck): New
>   prototype.
>   * config/i386/i386.c (ix86_split_mmx_punpck): New function.
>   * config/i386/mmx.md (mmx_punpckhbw): Changed to
>   define_insn_and_split to support SSE emulation.
>   (mmx_punpcklbw): Likewise.
>   (mmx_punpckhwd): Likewise.
>   (mmx_punpcklwd): Likewise.
>   (mmx_punpckhdq): Likewise.
>   (mmx_punpckldq): Likewise.

Please fix the split condition (as in the previous patch) and add the missing DONEs.

Uros.
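
For readers following the shuffle masks in ix86_split_mmx_punpck, here is
a minimal sketch in plain C of the V2SI case. It is a model only, not GCC
code: the function names and the array representation of registers are
assumptions made for illustration. The first function mirrors the
vec_concat followed by vec_select with mask (0, 4, 1, 5); the second
mirrors the extra shuffle with mask (2, 3, 0, 0) used when high_p is set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the V2SI case: concatenate two 4-element
   registers and select elements (0, 4, 1, 5), which interleaves the
   low halves of the two inputs.  */
static void
punpckldq_sse_model (uint32_t dst[4], const uint32_t a[4],
                     const uint32_t b[4])
{
  static const int mask[4] = { 0, 4, 1, 5 };
  uint32_t concat[8];

  /* vec_concat: a occupies elements 0..3, b occupies elements 4..7.
     Copying first also makes it safe for dst to alias a or b.  */
  for (int k = 0; k < 4; k++)
    {
      concat[k] = a[k];
      concat[k + 4] = b[k];
    }

  /* vec_select: element k of the result is element mask[k] of concat.  */
  for (int k = 0; k < 4; k++)
    {
      assert (mask[k] >= 0 && mask[k] < 8);
      dst[k] = concat[mask[k]];
    }
}

/* Hypothetical model of the shuffle with mask (2, 3, 0, 0): elements
   2 and 3 (bits 64:127) move to elements 0 and 1 (bits 0:63).  */
static void
move_upper_half_model (uint32_t reg[4])
{
  static const int mask[4] = { 2, 3, 0, 0 };
  uint32_t tmp[4];

  for (int k = 0; k < 4; k++)
    tmp[k] = reg[mask[k]];
  for (int k = 0; k < 4; k++)
    reg[k] = tmp[k];
}
```

With a = {a0, a1, x, x} and b = {b0, b1, x, x}, the first step yields
{a0, b0, a1, b1}. The low 64 bits are the punpckldq result; the high
64 bits are the punpckhdq result, which the second step moves down into
bits 0:63, where a 64-bit vector is expected to live.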

> ---
>  gcc/config/i386/i386-protos.h |   1 +
>  gcc/config/i386/i386.c|  77 +++
>  gcc/config/i386/mmx.md| 138 ++
>  3 files changed, 168 insertions(+), 48 deletions(-)
>
> diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> index bb96a420a85..dc7fc38d8e4 100644
> --- a/gcc/config/i386/i386-protos.h
> +++ b/gcc/config/i386/i386-protos.h
> @@ -202,6 +202,7 @@ extern rtx ix86_split_stack_guard (void);
>
>  extern void ix86_move_vector_high_sse_to_mmx (rtx);
>  extern void ix86_split_mmx_pack (rtx[], enum rtx_code);
> +extern void ix86_split_mmx_punpck (rtx[], bool);
>
>  #ifdef TREE_CODE
>  extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree,
> int);
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 2af7f891350..cf7a71bcc02 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -20009,6 +20009,83 @@ ix86_split_mmx_pack (rtx operands[], enum rtx_code
> code)
>ix86_move_vector_high_sse_to_mmx (op0);
>  }
>
> +/* Split MMX punpcklXX/punpckhXX with SSE punpcklXX.  */
> +
> +void
> +ix86_split_mmx_punpck (rtx operands[], bool high_p)
> +{
> +  rtx op0 = operands[0];
> +  rtx op1 = operands[1];
> +  rtx op2 = operands[2];
> +  machine_mode mode = GET_MODE (op0);
> +  rtx mask;
> +  /* The corresponding SSE mode.  */
> +  machine_mode sse_mode, double_sse_mode;
> +
> +  switch (mode)
> +{
> +case E_V8QImode:
> +  sse_mode = V16QImode;
> +  double_sse_mode = V32QImode;
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +gen_rtvec (16,
> +   GEN_INT (0), GEN_INT (16),
> +   GEN_INT (1), GEN_INT (17),
> +   GEN_INT (2), GEN_INT (18),
> +   GEN_INT (3), GEN_INT (19),
> +   GEN_INT (4), GEN_INT (20),
> +   GEN_INT (5), GEN_INT (21),
> +   GEN_INT (6), GEN_INT (22),
> +   GEN_INT (7), GEN_INT (23)));
> +  break;
> +
> +case E_V4HImode:
> +  sse_mode = V8HImode;
> +  double_sse_mode = V16HImode;
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +gen_rtvec (8,
> +   GEN_INT (0), GEN_INT (8),
> +   GEN_INT (1), GEN_INT (9),
> +   GEN_INT (2), GEN_INT (10),
> +   GEN_INT (3), GEN_INT (11)));
> +  break;
> +
> +case E_V2SImode:
> +  sse_mode = V4SImode;
> +  double_sse_mode = V8SImode;
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +gen_rtvec (4,
> +   GEN_INT (0), GEN_INT (4),
> +   GEN_INT (1), GEN_INT (5)));
> +  break;
> +
> +default:
> +  gcc_unreachable ();
> +}
> +
> +  /* Generate SSE punpcklXX.  */
> +  rtx dest = gen_rtx_REG (sse_mode, REGNO (op0));
> +  op1 = gen_rtx_REG (sse_mode, REGNO (op1));
> +  op2 = gen_rtx_REG (sse_mode, REGNO (op2));
> +
> +  op1 = gen_rtx_VEC_CONCAT (double_sse_mode, op1, op2);
> +  op2 = gen_rtx_VEC_SELECT (sse_mode, op1, mask);
> +  rtx insn = gen_rtx_SET (dest, op2);
> +  emit_insn (insn);
> +
> +  if (high_p)
> +{
> +  /* Move bits 64:127 to bits 0:63.  */
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +gen_rtvec (4, GEN_INT (2), GEN_INT (3),
> +   GEN_INT (0), GEN_INT (0)));
> +  dest = gen_rtx_REG (V4SImode, REGNO (dest));
> +  op1 = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
> +  insn = gen_rtx_SET (dest, op1);
> +  emit_insn (insn);
> +}
> +}
> +
>  /* Helper function of ix86_fixup_binary_operands to canonicalize
> operand order.  Returns true if the operands should be swapped.  */
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 10096f7cab7..ff9c5dc8507 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1089,87 +1089,129 @@

Re: [PATCH 02/43] i386: Emulate MMX packsswb/packssdw/packuswb with SSE2

2019-02-10 Thread Uros Bizjak
On 2/10/19, Uros Bizjak  wrote:
> On 2/10/19, H.J. Lu  wrote:
>> Emulate MMX packsswb/packssdw/packuswb with SSE
>> packsswb/packssdw/packuswb
>> plus moving bits 64:95 to bits 32:63 in SSE register.  Only SSE register
>> source operand is allowed.
>>
>> 2019-02-08  H.J. Lu  
>>  Uros Bizjak  
>>
>>  PR target/89021
>>  * config/i386/i386-protos.h (ix86_move_vector_high_sse_to_mmx):
>>  New prototype.
>>  (ix86_split_mmx_pack): Likewise.
>>  * config/i386/i386.c (ix86_move_vector_high_sse_to_mmx): New
>>  function.
>>  (ix86_split_mmx_pack): Likewise.
>>  * config/i386/i386.md (mmx_isa): New.
>>  (enabled): Also check mmx_isa.
>>  * config/i386/mmx.md (any_s_truncate): New code iterator.
>>  (s_trunsuffix): New code attr.
>>  (mmx_packsswb): Removed.
>>  (mmx_packssdw): Likewise.
>>  (mmx_packuswb): Likewise.
>>  (mmx_packswb): New define_insn_and_split to emulate
>>  MMX packsswb/packuswb with SSE2.
>>  (mmx_packssdw): Likewise.
>
> LGTM, with a couple of nits below.

Oh, you also need DONE; at the end of preparation statements,
otherwise splitters will inject (const_int 0) into the insn stream.

Uros.
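
As an aside on the "moving bits 64:95 to bits 32:63" step, the shuffle in
ix86_move_vector_high_sse_to_mmx can be modeled in plain C. This is a
sketch under assumptions, not GCC code: the function names and the
uint32_t[4] stand-in for a V4SI register are hypothetical. It shows what
a vec_select with mask (0, 2, 0, 0) does to the four 32-bit elements.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of (vec_select:V4SI src (parallel [s0 s1 s2 s3])):
   element k of the result is element sel[k] of the source.  */
static void
vec_select_v4si_model (uint32_t dst[4], const uint32_t src[4],
                       const int sel[4])
{
  uint32_t tmp[4];

  for (int k = 0; k < 4; k++)
    {
      assert (sel[k] >= 0 && sel[k] < 4);
      tmp[k] = src[sel[k]];
    }
  /* Go through a temporary so that dst and src may be the same
     register, as they are in the patch.  */
  memcpy (dst, tmp, sizeof tmp);
}

/* Mask (0, 2, 0, 0): element 0 stays in place and element 2
   (bits 64:95) lands in element 1 (bits 32:63).  */
static void
move_high_sse_to_mmx_model (uint32_t reg[4])
{
  static const int mask[4] = { 0, 2, 0, 0 };
  vec_select_v4si_model (reg, reg, mask);
}
```

The reason this is enough: the 128-bit pack puts the packed first source
in the low 64 bits and the packed second source in the high 64 bits.
When only the low 64 bits of each source are meaningful, the useful
results sit in bits 0:31 and bits 64:95, and the shuffle brings them
together in bits 0:63.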

>> ---
>>  gcc/config/i386/i386-protos.h |  3 ++
>>  gcc/config/i386/i386.c| 54 
>>  gcc/config/i386/i386.md   | 12 +++
>>  gcc/config/i386/mmx.md| 67 +++
>>  4 files changed, 106 insertions(+), 30 deletions(-)
>>
>> diff --git a/gcc/config/i386/i386-protos.h
>> b/gcc/config/i386/i386-protos.h
>> index 2d600173917..bb96a420a85 100644
>> --- a/gcc/config/i386/i386-protos.h
>> +++ b/gcc/config/i386/i386-protos.h
>> @@ -200,6 +200,9 @@ extern void ix86_expand_vecop_qihi (enum rtx_code,
>> rtx,
>> rtx, rtx);
>>
>>  extern rtx ix86_split_stack_guard (void);
>>
>> +extern void ix86_move_vector_high_sse_to_mmx (rtx);
>> +extern void ix86_split_mmx_pack (rtx[], enum rtx_code);
>> +
>>  #ifdef TREE_CODE
>>  extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree,
>> int);
>>  #endif  /* TREE_CODE  */
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index ba02c26c8b2..2af7f891350 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -19955,6 +19955,60 @@ ix86_expand_vector_move_misalign (machine_mode
>> mode, rtx operands[])
>>  gcc_unreachable ();
>>  }
>>
>> +/* Move bits 64:95 to bits 32:63.  */
>> +
>> +void
>> +ix86_move_vector_high_sse_to_mmx (rtx op)
>> +{
>> +  rtx mask = gen_rtx_PARALLEL (VOIDmode,
>> +   gen_rtvec (4, GEN_INT (0), GEN_INT (2),
>> +  GEN_INT (0), GEN_INT (0)));
>> +  rtx dest = gen_rtx_REG (V4SImode, REGNO (op));
>> +  op = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
>> +  rtx insn = gen_rtx_SET (dest, op);
>> +  emit_insn (insn);
>> +}
>> +
>> +/* Split MMX pack with signed/unsigned saturation with SSE/SSE2.  */
>> +
>> +void
>> +ix86_split_mmx_pack (rtx operands[], enum rtx_code code)
>> +{
>> +  rtx op0 = operands[0];
>> +  rtx op1 = operands[1];
>> +  rtx op2 = operands[2];
>> +
>> +  machine_mode dmode = GET_MODE (op0);
>> +  machine_mode smode = GET_MODE (op1);
>> +  machine_mode inner_dmode = GET_MODE_INNER (dmode);
>> +  machine_mode inner_smode = GET_MODE_INNER (smode);
>> +
>> +  /* Get the corresponding SSE mode for destination.  */
>> +  int nunits = 16 / GET_MODE_SIZE (inner_dmode);
>> +  machine_mode sse_dmode = mode_for_vector (GET_MODE_INNER (dmode),
>> +nunits).require ();
>> +  machine_mode sse_half_dmode = mode_for_vector (GET_MODE_INNER (dmode),
>> + nunits / 2).require ();
>> +
>> +  /* Get the corresponding SSE mode for source.  */
>> +  nunits = 16 / GET_MODE_SIZE (inner_smode);
>> +  machine_mode sse_smode = mode_for_vector (GET_MODE_INNER (smode),
>> +nunits).require ();
>> +
>> +  /* Generate SSE pack with signed/unsigned saturation.  */
>> +  rtx dest = gen_rtx_REG (sse_dmode, REGNO (op0));
>> +  op1 = gen_rtx_REG (sse_smode, REGNO (op1));
>> +  op2 = gen_rtx_REG (sse_smode, REGNO (op2));
>> +
>> +  op1 = gen_rtx_fmt_e (code, sse_half_dmode, op1);
>> +  op2 = gen_rtx_fmt_e (code, sse_half_dmode, op2);
>> +  rtx insn = gen_rtx_SET (dest, gen_rtx_VEC_CONCAT (sse_dmode,
>> +op1, op2));
>> +  emit_insn (insn);
>> +
>> +  ix86_move_vector_high_sse_to_mmx (op0);
>> +}
>> +
>>  /* Helper function of ix86_fixup_binary_operands to canonicalize
>> operand order.  Returns true if the operands should be swapped.  */
>>
>> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>> index 4a32144a71a..72685107fc0 100644
>> --- a/gcc/config/i386/i386.md
>> +++ b/gcc/config/i386/i386.md
>> @@ -792,6 +792,9 @@
>>  avx512vl,noavx512vl,x64_avx512dq,x64_avx512bw"
>>(const_string "base"))
>>
>> +;; Define 

Re: [PATCH 02/43] i386: Emulate MMX packsswb/packssdw/packuswb with SSE2

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> Emulate MMX packsswb/packssdw/packuswb with SSE packsswb/packssdw/packuswb
> plus moving bits 64:95 to bits 32:63 in SSE register.  Only SSE register
> source operand is allowed.
>
> 2019-02-08  H.J. Lu  
>   Uros Bizjak  
>
>   PR target/89021
>   * config/i386/i386-protos.h (ix86_move_vector_high_sse_to_mmx):
>   New prototype.
>   (ix86_split_mmx_pack): Likewise.
>   * config/i386/i386.c (ix86_move_vector_high_sse_to_mmx): New
>   function.
>   (ix86_split_mmx_pack): Likewise.
>   * config/i386/i386.md (mmx_isa): New.
>   (enabled): Also check mmx_isa.
>   * config/i386/mmx.md (any_s_truncate): New code iterator.
>   (s_trunsuffix): New code attr.
>   (mmx_packsswb): Removed.
>   (mmx_packssdw): Likewise.
>   (mmx_packuswb): Likewise.
>   (mmx_packswb): New define_insn_and_split to emulate
>   MMX packsswb/packuswb with SSE2.
>   (mmx_packssdw): Likewise.

LGTM, with a couple of nits below.

> ---
>  gcc/config/i386/i386-protos.h |  3 ++
>  gcc/config/i386/i386.c| 54 
>  gcc/config/i386/i386.md   | 12 +++
>  gcc/config/i386/mmx.md| 67 +++
>  4 files changed, 106 insertions(+), 30 deletions(-)
>
> diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> index 2d600173917..bb96a420a85 100644
> --- a/gcc/config/i386/i386-protos.h
> +++ b/gcc/config/i386/i386-protos.h
> @@ -200,6 +200,9 @@ extern void ix86_expand_vecop_qihi (enum rtx_code, rtx,
> rtx, rtx);
>
>  extern rtx ix86_split_stack_guard (void);
>
> +extern void ix86_move_vector_high_sse_to_mmx (rtx);
> +extern void ix86_split_mmx_pack (rtx[], enum rtx_code);
> +
>  #ifdef TREE_CODE
>  extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree,
> int);
>  #endif   /* TREE_CODE  */
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index ba02c26c8b2..2af7f891350 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -19955,6 +19955,60 @@ ix86_expand_vector_move_misalign (machine_mode
> mode, rtx operands[])
>  gcc_unreachable ();
>  }
>
> +/* Move bits 64:95 to bits 32:63.  */
> +
> +void
> +ix86_move_vector_high_sse_to_mmx (rtx op)
> +{
> +  rtx mask = gen_rtx_PARALLEL (VOIDmode,
> +gen_rtvec (4, GEN_INT (0), GEN_INT (2),
> +   GEN_INT (0), GEN_INT (0)));
> +  rtx dest = gen_rtx_REG (V4SImode, REGNO (op));
> +  op = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
> +  rtx insn = gen_rtx_SET (dest, op);
> +  emit_insn (insn);
> +}
> +
> +/* Split MMX pack with signed/unsigned saturation with SSE/SSE2.  */
> +
> +void
> +ix86_split_mmx_pack (rtx operands[], enum rtx_code code)
> +{
> +  rtx op0 = operands[0];
> +  rtx op1 = operands[1];
> +  rtx op2 = operands[2];
> +
> +  machine_mode dmode = GET_MODE (op0);
> +  machine_mode smode = GET_MODE (op1);
> +  machine_mode inner_dmode = GET_MODE_INNER (dmode);
> +  machine_mode inner_smode = GET_MODE_INNER (smode);
> +
> +  /* Get the corresponding SSE mode for destination.  */
> +  int nunits = 16 / GET_MODE_SIZE (inner_dmode);
> +  machine_mode sse_dmode = mode_for_vector (GET_MODE_INNER (dmode),
> + nunits).require ();
> +  machine_mode sse_half_dmode = mode_for_vector (GET_MODE_INNER (dmode),
> +  nunits / 2).require ();
> +
> +  /* Get the corresponding SSE mode for source.  */
> +  nunits = 16 / GET_MODE_SIZE (inner_smode);
> +  machine_mode sse_smode = mode_for_vector (GET_MODE_INNER (smode),
> + nunits).require ();
> +
> +  /* Generate SSE pack with signed/unsigned saturation.  */
> +  rtx dest = gen_rtx_REG (sse_dmode, REGNO (op0));
> +  op1 = gen_rtx_REG (sse_smode, REGNO (op1));
> +  op2 = gen_rtx_REG (sse_smode, REGNO (op2));
> +
> +  op1 = gen_rtx_fmt_e (code, sse_half_dmode, op1);
> +  op2 = gen_rtx_fmt_e (code, sse_half_dmode, op2);
> +  rtx insn = gen_rtx_SET (dest, gen_rtx_VEC_CONCAT (sse_dmode,
> + op1, op2));
> +  emit_insn (insn);
> +
> +  ix86_move_vector_high_sse_to_mmx (op0);
> +}
> +
>  /* Helper function of ix86_fixup_binary_operands to canonicalize
> operand order.  Returns true if the operands should be swapped.  */
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 4a32144a71a..72685107fc0 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -792,6 +792,9 @@
>   avx512vl,noavx512vl,x64_avx512dq,x64_avx512bw"
>(const_string "base"))
>
> +;; Define instruction set of MMX instructions
> +(define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx" (const_string
> "base"))
> +
>  (define_attr "enabled" ""
>(cond [(eq_attr "isa" "x64") (symbol_ref "TARGET_64BIT")
>(eq_attr "isa" "x64_sse2")
> @@ -830,6 +833,15 @@
>   

Re: [Patch] [arm] Fix 88714, Arm LDRD/STRD peepholes

2019-02-10 Thread Christophe Lyon
On Fri, 8 Feb 2019 at 12:40, Jakub Jelinek  wrote:
>
> On Fri, Feb 08, 2019 at 11:29:10AM +, Matthew Malcomson wrote:
> > I'm pretty sure there's no difference between the iwmmxt target and
> > others so believe your simpler fix of just using 'q' is a good idea.
> > (there's no difference in gas and no documentation I have found mentions
> > a difference).
>
> The simpler patch would be then (but of course in that case the question is
> why iwmmxt.md doesn't use those q constraints for the output_move_double
> alternatives).
>
> 2019-02-08  Jakub Jelinek  
>
> PR bootstrap/88714
> * config/arm/ldrdstrd.md (*arm_ldrd, *arm_strd): Use q constraint
> instead of r.
>

Both this simple patch and the previous one fix all the ICEs I reported, thanks.

Of course, the scan-assembler failures remain to be fixed.

> --- gcc/config/arm/ldrdstrd.md.jj   2019-02-08 11:25:42.368916124 +0100
> +++ gcc/config/arm/ldrdstrd.md  2019-02-08 12:38:33.647585108 +0100
> @@ -157,9 +157,9 @@ (define_peephole2 ; swap the destination
>  ;; We use gen_operands_ldrd_strd() with a modify argument as false so that 
> the
>  ;; operands are not changed.
>  (define_insn "*arm_ldrd"
> -  [(parallel [(set (match_operand:SI 0 "s_register_operand" "=r")
> +  [(parallel [(set (match_operand:SI 0 "s_register_operand" "=q")
>(match_operand:SI 2 "memory_operand" "m"))
> - (set (match_operand:SI 1 "s_register_operand" "=r")
> + (set (match_operand:SI 1 "s_register_operand" "=q")
>(match_operand:SI 3 "memory_operand" "m"))])]
>"TARGET_LDRD && TARGET_ARM && reload_completed
>&& valid_operands_ldrd_strd (operands, true)"
> @@ -178,9 +178,9 @@ (define_insn "*arm_ldrd"
>
>  (define_insn "*arm_strd"
>[(parallel [(set (match_operand:SI 2 "memory_operand" "=m")
> -  (match_operand:SI 0 "s_register_operand" "r"))
> +  (match_operand:SI 0 "s_register_operand" "q"))
>   (set (match_operand:SI 3 "memory_operand" "=m")
> -  (match_operand:SI 1 "s_register_operand" "r"))])]
> +  (match_operand:SI 1 "s_register_operand" "q"))])]
>"TARGET_LDRD && TARGET_ARM && reload_completed
>&& valid_operands_ldrd_strd (operands, false)"
>{
>
>
> Jakub


Re: [PATCH 01/43] i386: Allow 64-bit vector modes in SSE registers

2019-02-10 Thread Uros Bizjak
On 2/10/19, H.J. Lu  wrote:
> In 64-bit mode, SSE2 can be used to emulate MMX instructions without
> 3DNOW.  We can use SSE2 to support 64-bit vectors.
>
>   PR target/89021
>   * config/i386/i386.c (ix86_set_reg_reg_cost): Also support
>   VALID_MMX_WITH_SSE_REG_MODE.
>   (ix86_vector_mode_supported_p): Likewise.
>   * config/i386/i386.h (TARGET_MMX_WITH_SSE): New.
>   (TARGET_MMX_WITH_SSE_P): Likewise.
>   (VALID_MMX_WITH_SSE_REG_MODE): Likewise.
> ---
>  gcc/config/i386/i386.c |  3 +++
>  gcc/config/i386/i386.h | 14 ++
>  2 files changed, 17 insertions(+)
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 12bc7926f86..ba02c26c8b2 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -40235,6 +40235,7 @@ ix86_set_reg_reg_cost (machine_mode mode)
> || (TARGET_AVX && VALID_AVX256_REG_MODE (mode))
> || (TARGET_SSE2 && VALID_SSE2_REG_MODE (mode))
> || (TARGET_SSE && VALID_SSE_REG_MODE (mode))
> +   || (TARGET_MMX_WITH_SSE && VALID_MMX_WITH_SSE_REG_MODE (mode))
> || (TARGET_MMX && VALID_MMX_REG_MODE (mode)))

With V2SFmode out of the way (see below) we can finally use

(TARGET_MMX || TARGET_MMX_WITH_SSE) && VALID_MMX_REG_MODE (mode).

This is a cost function, and we do have DImode and SImode in SSE registers.

>   units = GET_MODE_SIZE (mode);
>  }
> @@ -44057,6 +44058,8 @@ ix86_vector_mode_supported_p (machine_mode mode)
>  return true;
>if (TARGET_SSE2 && VALID_SSE2_REG_MODE (mode))
>  return true;
> +  if (TARGET_MMX_WITH_SSE && VALID_MMX_WITH_SSE_REG_MODE (mode))
> +return true;

Assuming the middle end won't ask for scalar modes, and with V2SFmode out
of the way, we can also use

(TARGET_MMX || TARGET_MMX_WITH_SSE) && VALID_MMX_REG_MODE (mode)

here.

>if (TARGET_AVX && VALID_AVX256_REG_MODE (mode))
>  return true;
>if (TARGET_AVX512F && VALID_AVX512F_REG_MODE (mode))
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 83b025e0cf5..3ae0900caa0 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -201,6 +201,13 @@ see the files COPYING3 and COPYING.RUNTIME
> respectively.  If not, see
>  #define TARGET_16BIT TARGET_CODE16
>  #define TARGET_16BIT_P(x)TARGET_CODE16_P(x)
>
> +/* In 64-bit mode, SSE2 can be used to emulate MMX instructions.
> +   FIXME: All 3DNOW patterns needs to be updated with SSE emulation.  */
> +#define TARGET_MMX_WITH_SSE \
> +  (TARGET_64BIT && TARGET_SSE2 && !TARGET_3DNOW)
> +#define TARGET_MMX_WITH_SSE_P(x) \
> +  (TARGET_64BIT_P (x) && TARGET_SSE2_P (x) && !TARGET_3DNOW_P (x))

The above is not acceptable: the choice of native MMX should not
depend on the -m3dnow flag. So, instead of the FIXME, please leave the
partial conversion of V2SF mode out of the patchset; V2SF values should
still live in MMX registers.

Actually, -m3dnow is a dead-end, deprecated insn set, so I see no
reason to emulate V2SF at all. SSE doesn't have native V2SF
instructions, and emulating reciprocals will trap due to 0.0 in the
high two elements. The reciprocal step instructions are also hard to
emulate.

Also, the purpose of the patchset is to convert MMX builtins, since
SSE builtins depend on them, so that in the end we can avoid enabling MMX
registers with -msse, and thus make -mmmx orthogonal to -msse. We
don't want to sneak in an autovectorization of V2SF with the patchset.
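
To make the request concrete, here is a minimal sketch in plain C of a
predicate that does not look at the 3DNow! flag. It is a model only: the
struct, its field names, and the function name are hypothetical stand-ins
for the real target macros, and the exact final form of
TARGET_MMX_WITH_SSE is left to the patch author.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant target flags; not GCC code.  */
struct target_flags_model
{
  bool is_64bit;     /* models TARGET_64BIT */
  bool sse2;         /* models TARGET_SSE2 */
  bool three_dnow;   /* models TARGET_3DNOW */
};

/* Sketch of the requested condition: SSE emulation of MMX is chosen
   from 64-bit mode and SSE2 alone.  The three_dnow field is
   deliberately not consulted.  */
static bool
mmx_with_sse_model (const struct target_flags_model *t)
{
  assert (t != 0);
  return t->is_64bit && t->sse2;
}
```

With this shape, toggling the 3DNow! flag cannot change which register
file the integer 64-bit vector modes use, which is the property asked
for above.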

>  #include "config/vxworks-dummy.h"
>
>  #include "config/i386/i386-opts.h"
> @@ -1143,6 +1150,13 @@ extern const char *host_detect_local_cpu (int argc,
> const char **argv);
> || (MODE) == V4SImode || (MODE) == V4SFmode || (MODE) == V8HImode \
> || (MODE) == TFmode || (MODE) == V1TImode)
>
> +/* NB: Don't use VALID_MMX_REG_MODE with TARGET_MMX_WITH_SSE since we
> +   want to include only 8-byte vector modes, like V2SFmode, but not

No, we don't want to include V2SF mode vectors.

> +   DImode nor SImode.  */
> +#define VALID_MMX_WITH_SSE_REG_MODE(MODE)\
> +  ((MODE) == V1DImode || (MODE) == V8QImode || (MODE) == V4HImode\
> +   || (MODE) == V2SImode || (MODE) == V2SFmode)

Without V2SFmode, the above definition is unneeded.

Uros.

>  #define VALID_SSE2_REG_MODE(MODE)\
>((MODE) == V16QImode || (MODE) == V8HImode || (MODE) == V2DFmode   \
> || (MODE) == V2DImode || (MODE) == DFmode)
> --
> 2.20.1
>
>


[PATCH, PR d/88989] Committed fix for ICE on recursive field initializers

2019-02-10 Thread Iain Buclaw
Hi,

This patch merges the D front-end implementation with dmd upstream
39edbe17e.  It only includes a backport from a later version, fixing PR
d/88989.

Bootstrapped and regression tested on x86_64-linux-gnu.

Committed to trunk as r268740.
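
For context, the fix in the diff below relies on an "in use" flag to
detect that a field's initializer is being asked for while it is still
being evaluated. The following is a minimal sketch of that general
re-entrancy guard in plain C. It is not dmd code: the struct, its
fields, and the function name are hypothetical, and the real front end
sets and checks the flag in different places.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a field declaration; not a dmd type.  */
struct field_model
{
  bool inuse;                      /* set while the initializer is evaluated */
  struct field_model *depends_on;  /* field whose initializer this one needs */
  int value;                       /* the field's own contribution */
};

/* Evaluate a field's initializer.  Returns false, leaving *out
   untouched, if evaluation re-enters a field that is still being
   evaluated, i.e. on recursive initialization.  */
static bool
eval_initializer_model (struct field_model *f, int *out)
{
  if (f->inuse)
    return false;   /* recursive initialization detected */

  f->inuse = true;
  bool ok = true;
  int v = f->value;
  if (f->depends_on != NULL)
    {
      int dep = 0;
      ok = eval_initializer_model (f->depends_on, &dep);
      if (ok)
        v += dep;
    }
  f->inuse = false;

  if (ok)
    *out = v;
  return ok;
}
```

Without the guard, a field whose initializer depends on itself would
recurse until the stack overflows; with it, the evaluation stops at the
first re-entry and an error can be reported instead.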

-- 
Iain
---
diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index c1c6cc145c4..8b377015129 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-e21c07e84bd9668e1c0fc1f45e514c5fd76988e7
+39edbe17e7b5c761d780c9d1d4376a06df7bf3d8
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/dstruct.c b/gcc/d/dmd/dstruct.c
index b44d63298e6..d35b005a47d 100644
--- a/gcc/d/dmd/dstruct.c
+++ b/gcc/d/dmd/dstruct.c
@@ -723,7 +723,14 @@ bool AggregateDeclaration::fill(Loc loc, Expressions *elements, bool ctorinit)
 else if (vx->_init)
 {
 assert(!vx->_init->isVoidInitializer());
-e = vx->getConstInitializer(false);
+if (vx->inuse)   // https://issues.dlang.org/show_bug.cgi?id=18057
+{
+vx->error(loc, "recursive initialization of field");
+errors = true;
+e = NULL;
+}
+else
+e = vx->getConstInitializer(false);
 }
 else
 {
diff --git a/gcc/testsuite/gdc.test/compilable/interpret3.d b/gcc/testsuite/gdc.test/compilable/interpret3.d
index 8e7025c7f59..386743e6ddc 100644
--- a/gcc/testsuite/gdc.test/compilable/interpret3.d
+++ b/gcc/testsuite/gdc.test/compilable/interpret3.d
@@ -7731,3 +7731,14 @@ bool foo17407()
 
 static assert(!foo17407);
 
+/**/
+// https://issues.dlang.org/show_bug.cgi?id=18057
+// Recursive field initializer causes segfault.
+
+struct RBNode(T)
+{
+RBNode!T *copy = new RBNode!T;
+}
+
+static assert(!__traits(compiles, { alias bug18057 = RBNode!int; }));
+
diff --git a/gcc/testsuite/gdc.test/fail_compilation/fail18057.d b/gcc/testsuite/gdc.test/fail_compilation/fail18057.d
new file mode 100644
index 000..5e2bab7f796
--- /dev/null
+++ b/gcc/testsuite/gdc.test/fail_compilation/fail18057.d
@@ -0,0 +1,16 @@
+/**
+TEST_OUTPUT:
+---
+fail_compilation/fail18057.d(16): Error: template instance RBNode!int `RBNode` is not a template declaration, it is a struct
+fail_compilation/fail18057.d(13): Error: variable fail18057.RBNode.copy recursive initialization of field
+---
+*/
+
+// https://issues.dlang.org/show_bug.cgi?id=18057
+// Recursive field initializer causes segfault.
+struct RBNode
+{
+RBNode *copy = new RBNode;
+}
+
+alias bug18057 = RBNode!int;
diff --git a/gcc/testsuite/gdc.test/fail_compilation/fail18057b.d b/gcc/testsuite/gdc.test/fail_compilation/fail18057b.d
new file mode 100644
index 000..14abbfd346f
--- /dev/null
+++ b/gcc/testsuite/gdc.test/fail_compilation/fail18057b.d
@@ -0,0 +1,13 @@
+/**
+TEST_OUTPUT:
+---
+fail_compilation/fail18057b.d(12): Error: variable `fail18057b.Recursive.field` recursive initialization of field
+---
+*/
+
+// https://issues.dlang.org/show_bug.cgi?id=18057
+// Recursive field initializer causes segfault.
+struct Recursive
+{
+int field = Recursive();
+}