Re: [PATCH] vect: Use factored nloads for load cost modeling [PR82255]

2021-01-18 Thread Kewen.Lin via Gcc-patches
Hi Richard,

on 2021/1/15 8:03 PM, Richard Biener wrote:
> On Fri, Jan 15, 2021 at 9:11 AM Kewen.Lin  wrote:
>>
>> Hi,
>>
>> This patch follows Richard's suggestion in the thread discussion[1]:
>> it factors out the nloads computation in vectorizable_load for
>> strided access, so that we obtain consistent information
>> when estimating the costs.
>>
>> btw, the reason why I didn't try to save the information into
>> stmt_info during the analysis phase and then fetch it in the transform
>> phase is that the information is only needed for strided SLP loads, and
>> recomputing it looks cheap enough to be acceptable.
>>
>> Bootstrapped/regtested on powerpc64le-linux-gnu P9,
>> x86_64-redhat-linux and aarch64-linux-gnu.
>>
>> Is it OK for trunk?  Or does it belong to the next stage 1?
> 
> First of all I think this is stage1 material now.  As we now do
> SLP costing from vectorizable_* as well I would prefer to have
> vectorizable_* be structured so that costing is done next to
> the transform.  Thus, rather than finishing the vectorizable_*
> function when !vec_stmt, go along further, but in places where
> code is generated, depending on !vec_stmt, perform costing.
> This makes it easier to keep costing and transform in sync
> and match up.  It might not be necessary for the simple
> vectorizable_ functions but for loads and stores there are
> so many paths through code generation that matching it up
> with vect_model_{load/store}_cost is almost impossible.
> 

Thanks for the comments!  Your new suggestion sounds better,
I'll update the patch according to this.
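
To make the suggested structure concrete, here is a minimal sketch (not
GCC code): one routine walks the code-generation paths once and, at each
step, either records a cost or emits a statement, depending on whether it
runs for analysis or for transform.  All names (Plan, vectorize_load) and
the cost numbers are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Collects costs (analysis) or emitted statements (transform)."""
    transform: bool
    cost: int = 0
    stmts: list = field(default_factory=list)

    def step(self, stmt, cost):
        # Single decision point: same path, two possible outcomes.
        if self.transform:
            self.stmts.append(stmt)
        else:
            self.cost += cost


def vectorize_load(plan, strided, nloads):
    """Hypothetical stand-in for a vectorizable_* function."""
    if strided:
        for i in range(nloads):
            plan.step(f"load element {i}", cost=1)
        plan.step("build vector from elements", cost=2)
    else:
        plan.step("vector load", cost=1)
    return plan


analysis = vectorize_load(Plan(transform=False), strided=True, nloads=4)
codegen = vectorize_load(Plan(transform=True), strided=True, nloads=4)
print(analysis.cost)       # cost accumulated along the strided path
print(len(codegen.stmts))  # statements emitted along the same path
```

Because both runs go through the same branches, a new code-generation
path cannot be added without its cost being visible at the same spot.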


BR,
Kewen


Re: [PATCH] rs6000: Use rldimi for vec init instead of shift + ior

2021-01-18 Thread Kewen.Lin via Gcc-patches
on 2021/1/15 2:40 PM, Kewen.Lin via Gcc-patches wrote:
> Hi Segher,
> 
> on 2021/1/15 8:50 AM, Segher Boessenkool wrote:
>> Hi!
>>
>> On Tue, Dec 22, 2020 at 04:08:26PM +0800, Kewen.Lin wrote:
>>> This patch is to make unsigned int vector init go with
>>> rldimi to merge two integers instead of shift and ior.
>>>
>>> I tried to use nonzero_bits in the md file to make it more
>>> general, but testing shows that isn't doable.  The
>>> reason is that some passes replace pseudos with other
>>> pseudos and run recog again, and at that point
>>> nonzero_bits can return less precise information, which
>>> makes recog fail unexpectedly.
>>
>> Aha.  So nonzero_bits is unusable for most purposes as well.  Great :-/
>>
>>> btw, the test case would reply on the combine patch[1].
>>
>> Can you make a different testcase perhaps?  This patch looks fine
>> otherwise.
>>
> 
> Thanks!  Sorry, there is a typo: it should be "rely on"
> rather than "reply on".  Without the patch[1] "combine: zeroing cost
> for new copies", this test case doesn't get the desired result.
> I'll explain it more in that patch's thread.  If your uncommitted
> patch there with define_split and nonzero_bits works, this patch
> can be discarded entirely.

As the testing and the commit log show, that patch only works for
three-instruction combinations, so this patch is still needed.
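
For readers who don't have the instruction in their head, here is a small
model of the two ways to merge two 32-bit values into one doubleword.  It
is only a sketch: the rldimi semantics follow my reading of the ISA
(rotate left, then insert under a mask), the helper names are made up, and
which operand ends up in the high half for the vec init case is an
assumption.

```python
MASK64 = (1 << 64) - 1


def rotl64(x, n):
    """Rotate a 64-bit value left by n bits."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def ibm_mask(mb, me):
    """Ones from bit mb through bit me, IBM numbering (bit 0 is the MSB).

    Only the non-wrapping case mb <= me is modelled here.
    """
    assert 0 <= mb <= me <= 63
    return ((1 << (me - mb + 1)) - 1) << (63 - me)


def rldimi(ra, rs, sh, mb):
    """Model of rldimi ra,rs,sh,mb: insert rotated rs into ra under a mask."""
    m = ibm_mask(mb, 63 - sh)
    return (rotl64(rs, sh) & m) | (ra & ~m & MASK64)


def merge_shift_ior(hi, lo):
    """Shift + ior; only correct if the upper 32 bits of lo are zero."""
    return ((hi << 32) | lo) & MASK64


hi, lo = 0x11223344, 0xAABBCCDD
print(hex(merge_shift_ior(hi, lo)))
print(hex(rldimi(lo, hi, 32, 0)))

# With garbage in the upper half of lo, only the insert form is unaffected.
dirty_lo = 0xDEADBEEF00000000 | lo
print(hex(merge_shift_ior(hi, dirty_lo)))
print(hex(rldimi(dirty_lo, hi, 32, 0)))
```

The last two lines show why the insert form does not need to know that
the upper bits are zero, which is the information nonzero_bits was
supposed to provide.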


BR,
Kewen


Re: [PATCH] combine: zeroing cost for new copies

2021-01-18 Thread Kewen.Lin via Gcc-patches
Hi Segher,

on 2021/1/15 2:49 PM, Kewen.Lin via Gcc-patches wrote:
> on 2021/1/15 4:43 AM, Segher Boessenkool wrote:

[snip...]

>> Long ago I had the following patch for this.  Not sure why I never
>> submitted it, maybe there is something wrong with it?
>>
> 
> If you don't mind, I'll do a check with bootstrapping and regression
> testing and then get back to you.

I bootstrapped and regtested your posted patch on powerpc64le-linux-gnu
Power9 and powerpc64-linux-gnu Power8; it looks fine to land.  :)

I also confirmed that the vec init with type int patch works as expected
on top of this.  Could you double-check and commit it later
if you don't find anything wrong either?

Thanks in advance.

BR,
Kewen


Re: [committed] Skip asm goto test fails on hppa

2021-01-18 Thread Hans-Peter Nilsson
On Mon, 18 Jan 2021, John David Anglin wrote:
> The hppa target is a reload target and asm goto is not supported on reload 
> targets.
> Skip failing tests on hppa.

IIUC the preferred term is "IRA target" or maybe "non-LRA
target", as opposed to "LRA target".  The tests fail for
cris-elf too, another IRA target, so I'd like to use that term
when adjusting the dg-skip-if, hope you don't mind.

But also, I'd like to xfail it instead for cris-elf, which adds
a caveat: people might then think a "reload target" is not the
same as an "IRA target", what with the different adjustments.

brgds, H-P

>
> Committed to master.
>
> Regards,
> Dave
>
> Skip asm goto tests on hppa*-*-*.
>
> gcc/testsuite/ChangeLog:
>
>   PR testsuite/97987
>   * gcc.c-torture/compile/asmgoto-2.c: Skip on hppa.
>   * gcc.c-torture/compile/asmgoto-5.c: Likewise.
>
> diff --git a/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c 
> b/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c
> index f1b30c02884..d2d2ac536bd 100644
> --- a/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c
> +++ b/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c
> @@ -1,5 +1,6 @@
>  /* This test should be switched off for a new target with less than 4 
> allocatable registers */
>  /* { dg-do compile } */
> +/* { dg-skip-if "Reload target" { hppa*-*-* } } */
>  int
>  foo (void)
>  {
> diff --git a/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c 
> b/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c
> index 94c14dd4005..ce751ced90c 100644
> --- a/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c
> +++ b/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c
> @@ -1,6 +1,7 @@
>  /* Test to generate output reload in asm goto on x86_64.  */
>  /* { dg-do compile } */
>  /* { dg-skip-if "no O0" { { i?86-*-* x86_64-*-* } && { ! ia32 } } { "-O0" } 
> { "" } } */
> +/* { dg-skip-if "Reload target" { hppa*-*-* } } */
>
>  #if defined __x86_64__
>  #define ASM(s) asm (s)
>


Re: [PATCH,rs6000] Optimize pcrel access of globals [ping]

2021-01-18 Thread Aaron Sawdey via Gcc-patches
Ping.

Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain
 

> On Dec 9, 2020, at 11:04 AM, acsaw...@linux.ibm.com wrote:
> 
> From: Aaron Sawdey 
> 
> Ping. I've folded in the changes to comments suggested by Will Schmidt.
> 
> This patch implements a RTL pass that looks for pc-relative loads of the
> address of an external variable using the PCREL_GOT relocation and a
> single load or store that uses that external address.
> 
> Produced by a cast of thousands:
> * Michael Meissner
> * Peter Bergner
> * Bill Schmidt
> * Alan Modra
> * Segher Boessenkool
> * Aaron Sawdey
> 
> Passes bootstrap/regtest on ppc64le power10. Should have no effect on
> other processors. OK for trunk?
> 
> Thanks!
>   Aaron
> 
> gcc/ChangeLog:
> 
>   * config.gcc: Add pcrel-opt.c and pcrel-opt.o.
>   * config/rs6000/pcrel-opt.c: New file.
>   * config/rs6000/pcrel-opt.md: New file.
>   * config/rs6000/predicates.md: Add d_form_memory predicate.
>   * config/rs6000/rs6000-cpus.def: Add OPTION_MASK_PCREL_OPT.
>   * config/rs6000/rs6000-passes.def: Add pass_pcrel_opt.
>   * config/rs6000/rs6000-protos.h: Add reg_to_non_prefixed(),
>   offsettable_non_prefixed_memory(), output_pcrel_opt_reloc(),
>   and make_pass_pcrel_opt().
>   * config/rs6000/rs6000.c (reg_to_non_prefixed): Make global.
>   (rs6000_option_override_internal): Add pcrel-opt.
>   (rs6000_delegitimize_address): Support pcrel-opt.
>   (rs6000_opt_masks): Add pcrel-opt.
>   (offsettable_non_prefixed_memory): New function.
>   (reg_to_non_prefixed): Make global.
>   (rs6000_asm_output_opcode): Reset next_insn_prefixed_p.
>   (output_pcrel_opt_reloc): New function.
>   * config/rs6000/rs6000.md (loads_extern_addr): New attr.
>   (pcrel_extern_addr): Set loads_extern_addr.
>   Add include for pcrel-opt.md.
>   * config/rs6000/rs6000.opt: Add -mpcrel-opt.
>   * config/rs6000/t-rs6000: Add rules for pcrel-opt.c and
>   pcrel-opt.md.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pcrel-opt-inc-di.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-df.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-di.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-hi.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-qi.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-sf.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-si.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-vector.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-df.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-di.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-hi.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-qi.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-sf.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-si.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-vector.c: New test.
> ---
> gcc/config.gcc|   6 +-
> gcc/config/rs6000/pcrel-opt.c | 888 ++
> gcc/config/rs6000/pcrel-opt.md| 386 
> gcc/config/rs6000/predicates.md   |  23 +
> gcc/config/rs6000/rs6000-cpus.def |   2 +
> gcc/config/rs6000/rs6000-passes.def   |   8 +
> gcc/config/rs6000/rs6000-protos.h |   4 +
> gcc/config/rs6000/rs6000.c| 116 ++-
> gcc/config/rs6000/rs6000.md   |   8 +-
> gcc/config/rs6000/rs6000.opt  |   4 +
> gcc/config/rs6000/t-rs6000|   7 +-
> .../gcc.target/powerpc/pcrel-opt-inc-di.c |  18 +
> .../gcc.target/powerpc/pcrel-opt-ld-df.c  |  36 +
> .../gcc.target/powerpc/pcrel-opt-ld-di.c  |  43 +
> .../gcc.target/powerpc/pcrel-opt-ld-hi.c  |  42 +
> .../gcc.target/powerpc/pcrel-opt-ld-qi.c  |  42 +
> .../gcc.target/powerpc/pcrel-opt-ld-sf.c  |  42 +
> .../gcc.target/powerpc/pcrel-opt-ld-si.c  |  41 +
> .../gcc.target/powerpc/pcrel-opt-ld-vector.c  |  36 +
> .../gcc.target/powerpc/pcrel-opt-st-df.c  |  36 +
> .../gcc.target/powerpc/pcrel-opt-st-di.c  |  37 +
> .../gcc.target/powerpc/pcrel-opt-st-hi.c  |  42 +
> .../gcc.target/powerpc/pcrel-opt-st-qi.c  |  42 +
> .../gcc.target/powerpc/pcrel-opt-st-sf.c  |  36 +
> .../gcc.target/powerpc/pcrel-opt-st-si.c  |  41 +
> .../gcc.target/powerpc/pcrel-opt-st-vector.c  |  36 +
> 26 files changed, 2013 insertions(+), 9 deletions(-)
> create mode 100644 gcc/config/rs6000/pcrel-opt.c
> create mode 100644 gcc/config/rs6000/pcrel-opt.md
> create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c
> create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c
> create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c
> create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-hi.c
> create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-qi.c
> create mode 100644 

Re: [PATCH,rs6000] Test cases for p10 fusion patterns

2021-01-18 Thread Aaron Sawdey via Gcc-patches
Ping.

Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain
 

> On Jan 3, 2021, at 2:44 PM, Aaron Sawdey  wrote:
> 
> Ping.
> 
> Aaron Sawdey, Ph.D. saw...@linux.ibm.com
> IBM Linux on POWER Toolchain
> 
> 
>> On Dec 11, 2020, at 1:53 PM, acsaw...@linux.ibm.com wrote:
>> 
>> From: Aaron Sawdey 
>> 
>> This adds some test cases to make sure that the combine patterns for p10
>> fusion are working.
>> 
>> These test cases pass on power10. OK for trunk after the 2 previous patches
>> for the fusion patterns go in?
>> 
>> Thanks!
>>  Aaron
>> 
>> gcc/testsuite/ChangeLog:
>>  * gcc.target/powerpc/fusion-p10-ldcmpi.c: New file.
>>  * gcc.target/powerpc/fusion-p10-2logical.c: New file.
>> ---
>> .../gcc.target/powerpc/fusion-p10-2logical.c  | 201 ++
>> .../gcc.target/powerpc/fusion-p10-ldcmpi.c|  66 ++
>> 2 files changed, 267 insertions(+)
>> create mode 100644 gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c
>> create mode 100644 gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
>> 
>> diff --git a/gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c 
>> b/gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c
>> new file mode 100644
>> index 000..cfe8f6c679a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c
>> @@ -0,0 +1,201 @@
>> +/* { dg-do compile { target { powerpc*-*-* } } } */
>> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
>> +/* { dg-options "-mdejagnu-cpu=power10 -O3 -dp" } */
>> +
>> +#include 
>> +#include 
>> +
>> +/* and/andc/eqv/nand/nor/or/orc/xor */
>> +#define AND(a,b) ((a)&(b))
>> +#define ANDC1(a,b) ((a)&((~b)))
>> +#define ANDC2(a,b) ((~(a))&(b))
>> +#define EQV(a,b) (~((a)^(b)))
>> +#define NAND(a,b) (~((a)&(b)))
>> +#define NOR(a,b) (~((a)|(b)))
>> +#define OR(a,b) ((a)|(b))
>> +#define ORC1(a,b) ((a)|((~b)))
>> +#define ORC2(a,b) ((~(a))|(b))
>> +#define XOR(a,b) ((a)^(b))
>> +#define TEST1(type, func)   
>> \
>> +  type func ## _and_T_ ## type (type a, type b, type c) { return 
>> AND(func(a,b),c); } \
>> +  type func ## _andc1_T_   ## type (type a, type b, type c) { return 
>> ANDC1(func(a,b),c); } \
>> +  type func ## _andc2_T_   ## type (type a, type b, type c) { return 
>> ANDC2(func(a,b),c); } \
>> +  type func ## _eqv_T_ ## type (type a, type b, type c) { return 
>> EQV(func(a,b),c); } \
>> +  type func ## _nand_T_## type (type a, type b, type c) { return 
>> NAND(func(a,b),c); } \
>> +  type func ## _nor_T_ ## type (type a, type b, type c) { return 
>> NOR(func(a,b),c); } \
>> +  type func ## _or_T_  ## type (type a, type b, type c) { return 
>> OR(func(a,b),c); } \
>> +  type func ## _orc1_T_## type (type a, type b, type c) { return 
>> ORC1(func(a,b),c); } \
>> +  type func ## _orc2_T_## type (type a, type b, type c) { return 
>> ORC2(func(a,b),c); } \
>> +  type func ## _xor_T_ ## type (type a, type b, type c) { return 
>> XOR(func(a,b),c); } \
>> +  type func ## _rev_and_T_ ## type (type a, type b, type c) { return 
>> AND(c,func(a,b)); } \
>> +  type func ## _rev_andc1_T_   ## type (type a, type b, type c) { return 
>> ANDC1(c,func(a,b)); } \
>> +  type func ## _rev_andc2_T_   ## type (type a, type b, type c) { return 
>> ANDC2(c,func(a,b)); } \
>> +  type func ## _rev_eqv_T_ ## type (type a, type b, type c) { return 
>> EQV(c,func(a,b)); } \
>> +  type func ## _rev_nand_T_## type (type a, type b, type c) { return 
>> NAND(c,func(a,b)); } \
>> +  type func ## _rev_nor_T_ ## type (type a, type b, type c) { return 
>> NOR(c,func(a,b)); } \
>> +  type func ## _rev_or_T_  ## type (type a, type b, type c) { return 
>> OR(c,func(a,b)); } \
>> +  type func ## _rev_orc1_T_## type (type a, type b, type c) { return 
>> ORC1(c,func(a,b)); } \
>> +  type func ## _rev_orc2_T_## type (type a, type b, type c) { return 
>> ORC2(c,func(a,b)); } \
>> +  type func ## _rev_xor_T_ ## type (type a, type b, type c) { return 
>> XOR(c,func(a,b)); }
>> +#define TEST(type)\
>> +  TEST1(type,AND) \
>> +  TEST1(type,ANDC1)   \
>> +  TEST1(type,ANDC2)   \
>> +  TEST1(type,EQV) \
>> +  TEST1(type,NAND)\
>> +  TEST1(type,NOR) \
>> +  TEST1(type,OR)  \
>> +  TEST1(type,ORC1)\
>> +  TEST1(type,ORC2)\
>> +  TEST1(type,XOR)
>> +
>> +typedef vector bool char vboolchar_t;
>> +typedef vector unsigned int vuint_t;
>> +
>> +TEST(uint8_t);
>> +TEST(int8_t);
>> +TEST(uint16_t);
>> +TEST(int16_t);
>> +TEST(uint32_t);
>> +TEST(int32_t);
>> +TEST(uint64_t);
>> +TEST(int64_t);
>> +TEST(vboolchar_t);
>> +TEST(vuint_t);
>> +  
>> +/* { dg-final { scan-assembler-times "fuse_and_and/0"16 } } */
>> +/* { dg-final { scan-assembler-times "fuse_and_and/2"16 } } */
>> +/* { dg-final { scan-assembler-times "fuse_andc_and/0"   48 } } */
>> +/* { dg-final { scan-assembler-times "fuse_andc_and/1"   16 } } */
>> +/* { dg-final { scan-assembler-times "fuse_andc_and/2"   

Re: [PATCH,rs6000] Fusion patterns for logical-logical

2021-01-18 Thread Aaron Sawdey via Gcc-patches
Ping.

Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain
 

> On Jan 3, 2021, at 2:43 PM, Aaron Sawdey  wrote:
> 
> Ping.
> 
> Aaron Sawdey, Ph.D. saw...@linux.ibm.com
> IBM Linux on POWER Toolchain
> 
> 
>> On Dec 10, 2020, at 8:41 PM, acsaw...@linux.ibm.com wrote:
>> 
>> From: Aaron Sawdey 
>> 
>> This patch adds a new function to genfusion.pl to generate patterns for
>> logical-logical fusion. They are enabled by default for power10 and can
>> be disabled by -mno-power10-fusion-2logical or -mno-power10-fusion.
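
As a sanity check on the shape of these patterns, here is a small
exhaustive model.  It is a sketch only: it covers just the four inner
operations whose patterns are quoted in this mail, it uses a 4-bit width to
keep the loop cheap, and the table and function names are made up.  It
compares the RTL form of each pattern with the two-instruction sequence
the first alternative emits ("inner %3,%1,%0" followed by
"and %3,%3,%2").

```python
from itertools import product

BITS = 4                 # small width so exhaustive checking is cheap
M = (1 << BITS) - 1


def NOT(x):
    return ~x & M


# Instruction semantics for "op rt,rs,rb".
INSN = {
    "and":  lambda rs, rb: rs & rb,
    "andc": lambda rs, rb: rs & NOT(rb),
    "eqv":  lambda rs, rb: NOT(rs ^ rb),
    "nand": lambda rs, rb: NOT(rs & rb),
}

# RTL shape of the inner operation on operands 0 and 1, as in the patterns.
RTL = {
    "and":  lambda op0, op1: op0 & op1,
    "andc": lambda op0, op1: NOT(op0) & op1,
    "eqv":  lambda op0, op1: NOT(op0 ^ op1),
    "nand": lambda op0, op1: NOT(op0) | NOT(op1),
}


def fused_asm(inner, op0, op1, op2):
    """inner %3,%1,%0 ; and %3,%3,%2"""
    tmp = INSN[inner](op1, op0)
    return INSN["and"](tmp, op2)


def fused_rtl(inner, op0, op1, op2):
    """(and (inner-rtl op0 op1) op2)"""
    return RTL[inner](op0, op1) & op2


mismatches = [
    (inner, a, b, c)
    for inner in INSN
    for a, b, c in product(range(M + 1), repeat=3)
    if fused_asm(inner, a, b, c) != fused_rtl(inner, a, b, c)
]
print(len(mismatches))
```

The nand case is the interesting one: the RTL is written as an ior of two
nots, which is the same thing by De Morgan.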
>> 
>> This patch builds on top of the load-cmpi patch posted earlier this week.
>> 
>> Bootstrap passed on ppc64le/power10, if regtests pass, ok for trunk?
>> 
>> gcc/ChangeLog
>>  * config/rs6000/genfusion.pl (gen_2logical): New function to
>>  generate patterns for logical-logical fusion.
>>  * config/rs6000/fusion.md: Regenerated patterns.
>>  * config/rs6000/rs6000-cpus.def: Add
>>  OPTION_MASK_P10_FUSION_2LOGICAL.
>>  * config/rs6000/rs6000.c (rs6000_option_override_internal):
>>  Enable logical-logical fusion for p10.
>>  * config/rs6000/rs6000.opt: Add -mpower10-fusion-2logical.
>> ---
>> gcc/config/rs6000/fusion.md   | 2176 +
>> gcc/config/rs6000/genfusion.pl|   89 ++
>> gcc/config/rs6000/rs6000-cpus.def |4 +-
>> gcc/config/rs6000/rs6000.c|3 +
>> gcc/config/rs6000/rs6000.opt  |4 +
>> 5 files changed, 2275 insertions(+), 1 deletion(-)
>> 
>> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
>> index a4d3a6ae7f3..1ddbe7fe3d2 100644
>> --- a/gcc/config/rs6000/fusion.md
>> +++ b/gcc/config/rs6000/fusion.md
>> @@ -355,3 +355,2179 @@ (define_insn_and_split 
>> "*lbz_cmpldi_cr0_QI_GPR_CCUNS_zero"
>>   (set_attr "cost" "8")
>>   (set_attr "length" "8")])
>> 
>> +
>> +;; logical-logical fusion pattern generated by gen_2logical
>> +;; kind: scalar outer: and op and rtl and inv 0 comp 0
>> +;; inner: and op and rtl and inv 0 comp 0
>> +(define_insn "*fuse_and_and"
>> +  [(set (match_operand:GPR 3 "gpc_reg_operand" "=,0,1,r")
>> +(and:GPR (and:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r") 
>> (match_operand:GPR 1 "gpc_reg_operand" "%r,r,r,r")) (match_operand:GPR 2 
>> "gpc_reg_operand" "r,r,r,r")))
>> +   (clobber (match_scratch:GPR 4 "=X,X,X,r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2LOGICAL)"
>> +  "@
>> +   and %3,%1,%0\;and %3,%3,%2
>> +   and %0,%1,%0\;and %0,%0,%2
>> +   and %1,%1,%0\;and %1,%1,%2
>> +   and %4,%1,%0\;and %3,%4,%2"
>> +  [(set_attr "type" "logical")
>> +   (set_attr "cost" "6")
>> +   (set_attr "length" "8")])
>> +
>> +;; logical-logical fusion pattern generated by gen_2logical
>> +;; kind: scalar outer: and op and rtl and inv 0 comp 0
>> +;; inner: andc op andc rtl and inv 0 comp 1
>> +(define_insn "*fuse_andc_and"
>> +  [(set (match_operand:GPR 3 "gpc_reg_operand" "=,0,1,r")
>> +(and:GPR (and:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" 
>> "r,r,r,r")) (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r")) 
>> (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
>> +   (clobber (match_scratch:GPR 4 "=X,X,X,r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2LOGICAL)"
>> +  "@
>> +   andc %3,%1,%0\;and %3,%3,%2
>> +   andc %0,%1,%0\;and %0,%0,%2
>> +   andc %1,%1,%0\;and %1,%1,%2
>> +   andc %4,%1,%0\;and %3,%4,%2"
>> +  [(set_attr "type" "logical")
>> +   (set_attr "cost" "6")
>> +   (set_attr "length" "8")])
>> +
>> +;; logical-logical fusion pattern generated by gen_2logical
>> +;; kind: scalar outer: and op and rtl and inv 0 comp 0
>> +;; inner: eqv op eqv rtl xor inv 1 comp 0
>> +(define_insn "*fuse_eqv_and"
>> +  [(set (match_operand:GPR 3 "gpc_reg_operand" "=,0,1,r")
>> +(and:GPR (not:GPR (xor:GPR (match_operand:GPR 0 "gpc_reg_operand" 
>> "r,r,r,r") (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r"))) 
>> (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
>> +   (clobber (match_scratch:GPR 4 "=X,X,X,r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2LOGICAL)"
>> +  "@
>> +   eqv %3,%1,%0\;and %3,%3,%2
>> +   eqv %0,%1,%0\;and %0,%0,%2
>> +   eqv %1,%1,%0\;and %1,%1,%2
>> +   eqv %4,%1,%0\;and %3,%4,%2"
>> +  [(set_attr "type" "logical")
>> +   (set_attr "cost" "6")
>> +   (set_attr "length" "8")])
>> +
>> +;; logical-logical fusion pattern generated by gen_2logical
>> +;; kind: scalar outer: and op and rtl and inv 0 comp 0
>> +;; inner: nand op nand rtl ior inv 0 comp 3
>> +(define_insn "*fuse_nand_and"
>> +  [(set (match_operand:GPR 3 "gpc_reg_operand" "=,0,1,r")
>> +(and:GPR (ior:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" 
>> "r,r,r,r")) (not:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r"))) 
>> (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
>> +   (clobber (match_scratch:GPR 4 "=X,X,X,r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2LOGICAL)"
>> +  "@
>> +   nand %3,%1,%0\;and %3,%3,%2
>> +   nand %0,%1,%0\;and %0,%0,%2
>> +   nand %1,%1,%0\;and 

Re: [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion

2021-01-18 Thread Aaron Sawdey via Gcc-patches
Ping.

Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain
 

> On Jan 3, 2021, at 2:42 PM, Aaron Sawdey  wrote:
> 
> Ping.
> 
> I assume we’re going to want a separate patch for the new instruction type.
> 
> Aaron Sawdey, Ph.D. saw...@linux.ibm.com
> IBM Linux on POWER Toolchain
> 
> 
>> On Dec 4, 2020, at 1:19 PM, acsaw...@linux.ibm.com wrote:
>> 
>> From: Aaron Sawdey 
>> 
>> This patch adds the first batch of patterns to support p10 fusion. These
>> will allow combine to create a single insn for a pair of instructions
>> that power10 can fuse and execute. These particular ones have the
>> requirement that only cr0 can be used when fusing a load with a compare
>> immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
>> to put that requirement in, and if it doesn't work out later the splitter
>> can get used.
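
The constraint described above can be written down as a tiny predicate.
This is only a sketch of the stated rule, with made-up names; it is not
the predicate from predicates.md.

```python
def immediate_ok(value, signed):
    """Immediate allowed for fusion: -1/0/1 if signed, 0/1 if unsigned."""
    allowed = (-1, 0, 1) if signed else (0, 1)
    return value in allowed


def can_fuse_ld_cmpi(cr_field, value, signed):
    """Hypothetical check: load + compare-immediate fuses only on cr0."""
    return cr_field == 0 and immediate_ok(value, signed)


print(can_fuse_ld_cmpi(0, -1, signed=True))
print(can_fuse_ld_cmpi(0, -1, signed=False))
print(can_fuse_ld_cmpi(1, 0, signed=True))
```

When the check fails after combine has already formed the single insn,
the splitter mentioned above is what breaks it back into two
instructions.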
>> 
>> The patterns are generated by a script genfusion.pl and live in new file
>> fusion.md. This script will be expanded to generate more patterns for
>> fusion.
>> 
>> This also adds option -mpower10-fusion which defaults on for power10 and
>> will gate all these fusion patterns. In addition I have added an
>> undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
>> that just controls the load+compare-immediate patterns. I have made
>> these default on for power10 but they are not disallowed for earlier
>> processors because it is still valid code. This allows us to test the
>> correctness of fusion code generation by turning it on explicitly.
>> 
>> If bootstrap/regtest is clean, ok for trunk?
>> 
>> Thanks!
>> 
>>  Aaron
>> 
>> gcc/ChangeLog:
>> 
>>  * config/rs6000/genfusion.pl: New file, script to generate
>>  define_insn_and_split patterns so combine can arrange fused
>>  instructions next to each other.
>>  * config/rs6000/fusion.md: New file, generated fused instruction
>>  patterns for combine.
>>  * config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
>>  (non_update_memory_operand): New predicate.
>>  * config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
>>  OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
>>  POWERPC_MASKS.
>>  * config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
>>  prototype.
>>  * config/rs6000/rs6000.c (rs6000_option_override_internal):
>>  automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
>>  if target is power10.  (rs6000_opt_masks): Allow -mpower10-fusion
>>  in function attributes.  (address_is_non_pfx_d_or_x): New function.
>>  * config/rs6000/rs6000.h: Add MASK_P10_FUSION.
>>  * config/rs6000/rs6000.md: Include fusion.md.
>>  * config/rs6000/rs6000.opt: Add -mpower10-fusion
>>  and -mpower10-fusion-ld-cmpi.
>>  * config/rs6000/t-rs6000: Add dependencies involving fusion.md.
>> ---
>> gcc/config/rs6000/fusion.md   | 357 ++
>> gcc/config/rs6000/genfusion.pl| 144 
>> gcc/config/rs6000/predicates.md   |  14 ++
>> gcc/config/rs6000/rs6000-cpus.def |   6 +-
>> gcc/config/rs6000/rs6000-protos.h |   2 +
>> gcc/config/rs6000/rs6000.c|  51 +
>> gcc/config/rs6000/rs6000.h|   1 +
>> gcc/config/rs6000/rs6000.md   |   1 +
>> gcc/config/rs6000/rs6000.opt  |   8 +
>> gcc/config/rs6000/t-rs6000|   6 +-
>> 10 files changed, 588 insertions(+), 2 deletions(-)
>> create mode 100644 gcc/config/rs6000/fusion.md
>> create mode 100755 gcc/config/rs6000/genfusion.pl
>> 
>> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
>> new file mode 100644
>> index 000..a4d3a6ae7f3
>> --- /dev/null
>> +++ b/gcc/config/rs6000/fusion.md
>> @@ -0,0 +1,357 @@
>> +;; -*- buffer-read-only: t -*-
>> +;; Generated automatically by genfusion.pl
>> +
>> +;; Copyright (C) 2020 Free Software Foundation, Inc.
>> +;;
>> +;; This file is part of GCC.
>> +;;
>> +;; GCC is free software; you can redistribute it and/or modify it under
>> +;; the terms of the GNU General Public License as published by the Free
>> +;; Software Foundation; either version 3, or (at your option) any later
>> +;; version.
>> +;;
>> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +;; for more details.
>> +;;
>> +;; You should have received a copy of the GNU General Public License
>> +;; along with GCC; see the file COPYING3.  If not see
>> +;; .
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is DI result mode is clobber compare mode is CC extend is none
>> +(define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
>> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
>> +(compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
>> + 

Re: [PATCH] RISC-V: The 'multilib-generator' enhancement.

2021-01-18 Thread Kito Cheng via Gcc-patches
Hi Geng Qi:

Thanks for your patch, committed!


On Mon, Jan 18, 2021 at 3:01 PM Geng Qi via Gcc-patches
 wrote:
>
> From: gengqi 
>
> Think about this case:
>   ./multilib-generator rv32imc-ilp32-rv32imac,rv32imacxthead-f
> Here are 2 problems:
>   1. An unexpected 'xtheadf' extension was made.
>   2. The arch 'rv32imac' was not created.
> This modification fixes both, and also sorts 'multi-letter' extensions.
>
> gcc/ChangeLog:
> * config/riscv/arch-canonicalize
> (longext_sort): New function for sorting 'multi-letter'.
> * config/riscv/multilib-generator: Adjusting the loop of 'alt' in
> 'alts'. The 'arch' may not be the first of 'alts'.
> (_expand_combination): Add underline for the 'ext' without '*'.
> This is because, a single-letter extension can always be treated well
> with a '_' prefix, but it cannot be separated out if it is appended
> to a multi-letter.
> ---
>  gcc/config/riscv/arch-canonicalize  | 14 +-
>  gcc/config/riscv/multilib-generator | 12 +++-
>  2 files changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/riscv/arch-canonicalize 
> b/gcc/config/riscv/arch-canonicalize
> index 2b4289e..a1e4570 100755
> --- a/gcc/config/riscv/arch-canonicalize
> +++ b/gcc/config/riscv/arch-canonicalize
> @@ -74,8 +74,20 @@ def arch_canonicalize(arch):
># becasue we just append extensions list to the arch string.
>std_exts += list(filter(lambda x:len(x) == 1, long_exts))
>
> +  def longext_sort (exts):
> +if not exts.startswith("zxm") and exts.startswith("z"):
> +  # If "Z" extensions are named, they should be ordered first by 
> CANONICAL.
> +  if exts[1] not in CANONICAL_ORDER:
> +raise Exception("Unsupported extension `%s`" % exts)
> +  canonical_sort = CANONICAL_ORDER.index(exts[1])
> +else:
> +  canonical_sort = -1
> +return (exts.startswith("x"), exts.startswith("zxm"),
> +LONG_EXT_PREFIXES.index(exts[0]), canonical_sort, exts[1:])
> +
># Multi-letter extension must be in lexicographic order.
> -  long_exts = list(sorted(filter(lambda x:len(x) != 1, long_exts)))
> +  long_exts = list(sorted(filter(lambda x:len(x) != 1, long_exts),
> +  key=longext_sort))
>
># Put extensions in canonical order.
>for ext in CANONICAL_ORDER:
> diff --git a/gcc/config/riscv/multilib-generator 
> b/gcc/config/riscv/multilib-generator
> index 64ff15f..7b22537 100755
> --- a/gcc/config/riscv/multilib-generator
> +++ b/gcc/config/riscv/multilib-generator
> @@ -68,15 +68,15 @@ def arch_canonicalize(arch):
>  def _expand_combination(ext):
>exts = list(ext.split("*"))
>
> -  # No need to expand if there is no `*`.
> -  if len(exts) == 1:
> -return [(exts[0],)]
> -
># Add underline to every extension.
># e.g.
>#  _b * zvamo => _b * _zvamo
>exts = list(map(lambda x: '_' + x, exts))
>
> +  # No need to expand if there is no `*`.
> +  if len(exts) == 1:
> +return [(exts[0],)]
> +
># Generate combination!
>ext_combs = []
>for comb_len in range(1, len(exts)+1):
> @@ -147,7 +147,9 @@ for cfg in sys.argv[1:]:
># Drop duplicated entry.
>alts = unique(alts)
>
> -  for alt in alts[1:]:
> +  for alt in alts:
> +if alt == arch:
> +  continue
>  arches[alt] = 1
>  reuse.append('march.%s/mabi.%s=march.%s/mabi.%s' % (arch, abi, alt, abi))
>required.append('march=%s/mabi=%s' % (arch, abi))
> --
> 2.7.4
>
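
For anyone reading the archive later, the two fixes can be reproduced in
isolation.  This is a simplified sketch modelled on the hunks above, not
the script itself: the helper names are mine, the real script also runs
every name through arch_canonicalize, and the ordering of alts in the last
example is hypothetical.

```python
from itertools import combinations


def all_combinations(exts):
    """Every non-empty combination of exts, shortest first."""
    combs = []
    for n in range(1, len(exts) + 1):
        combs.extend(combinations(exts, n))
    return combs


def expand_old(ext):
    """Before the patch: early return happens before the '_' prefix."""
    exts = ext.split("*")
    if len(exts) == 1:
        return [(exts[0],)]
    return all_combinations(["_" + x for x in exts])


def expand_new(ext):
    """After the patch: prefix every extension first, then return."""
    exts = ["_" + x for x in ext.split("*")]
    if len(exts) == 1:
        return [(exts[0],)]
    return all_combinations(exts)


def reuse_targets(arch, alts):
    """Patched loop: skip arch wherever it sits in alts."""
    return [alt for alt in alts if alt != arch]


base = "rv32imacxthead"
old_name = base + "".join(expand_old("f")[0])
new_name = base + "".join(expand_new("f")[0])
print(old_name)   # the single letter gets glued onto the multi-letter name
print(new_name)   # the underscore keeps it separable
print(expand_new("b*zvamo"))

# Hypothetical ordering in which arch is not the first entry of alts.
alts = ["rv32imac", "rv32imc", "rv32imacxthead_f"]
print(reuse_targets("rv32imc", alts))
```

With the old `alts[1:]` slice, the list above would have dropped
'rv32imac' and kept the arch itself, which matches the second problem in
the description.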


Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-18 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 18, 2021 at 7:10 PM Richard Sandiford
 wrote:
>
> Hongtao Liu  writes:
> > On Mon, Jan 18, 2021 at 6:18 PM Richard Sandiford
> >  wrote:
> >>
> >> Hongtao Liu via Gcc-patches  writes:
> >> > Hi:
> >> >   If SRC had been assigned a mode narrower than the copy, we can't link
> >> > DEST into the chain even if they have the same
> >> > hard_regno_nregs (i.e. HImode/SImode in the i386 backend).
> >>
> >> In general, changes between modes within the same hard register are OK.
> >> Could you explain in more detail what's going wrong?
> >>
> >
> > cprop_hardreg changes
> >
> > (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> > (reg:SI 37 r9 [orig:86 _11 ] [86])) "test.c":29:36 75 
> > {*movsi_internal}
> >  (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
> > (nil)))
> >
> > to
> >
> > (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> > (reg:SI 22 xmm2 [orig:86 _11 ] [86])) "test.c":29:36 75
> > {*movsi_internal}
> >  (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
> > (nil)))
> >
> > since (reg:SI 22 xmm2) and (reg:SI r9) are in the same value chain in
> > which the oldest regno is k0.
> >
> > but with xmm2 defined as
> >
> > kmovw %k0, %edi  # 69 [c=4 l=4] *movhi_internal/6- kmovw move the
> > lower 16bits to %edi, and clear the upper 16 bits.
> > vmovd %edi, %xmm2 # 489 *movsi_internal  --- vmovd move 32bits from
> > %edi to %xmm2.
> >
> > (insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
> > (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
> > {*movhi_internal}
> >  (nil))
> >
> > (insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
> > (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
> >  (nil))
>
> The sequence is OK in itself, but insn 489 can't make any assumptions
> about what's in the upper 16 bits of %edi.  In other words, as far as
> RTL semantics are concerned, insn 489 only leaves bits 0-15 of %xmm2
> with defined values; the other bits are undefined.
>
> If the target wants all 32 bits of %edi to be carried over to insn 489
> then it needs to make insn 69 an SImode set instead of a HImode set.
>

Actually, only the lower 16 bits are needed; the original insns look like this:

.294.r.ira
(insn 69 68 70 13 (set (reg:HI 96 [ _52 ])
(subreg:HI (reg:DI 82 [ var_6.0_1 ]) 0)) "test.c":21:23 76
{*movhi_internal}
 (nil))
(insn 78 75 82 13 (set (reg:V4HI 140 [ _283 ])
(vec_duplicate:V4HI (truncate:HI (subreg:SI (reg:HI 96 [ _52
]) 0 1412 {*vec_dupv4hi}
 (nil))

.295r.reload
(insn 69 68 70 13 (set (reg:HI 5 di [orig:96 _52 ] [96])
(reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
{*movhi_internal}
 (nil))
(insn 489 75 78 13 (set (reg:SI 22 xmm2 [297])
(reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
 (nil))
(insn 78 489 490 13 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
(vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297]
1412 {*vec_dupv4hi}
 (nil))

Insn 489 is created by LRA/reload, which seems OK for the sequence itself
but problematic given the logic of hardreg_cprop.

> So what cprop is doing is OK: it's changing the values of undefined
> bits but not changing the definition of defined bits (from an RTL
> point of view).
>
> Thanks,
> Richard



-- 
BR,
Hongtao


Go patch committed: Read embedcfg file, parse go:embed directives

2021-01-18 Thread Ian Lance Taylor via Gcc-patches
This patch to the Go frontend reads go:embed directives and attaches
them to variables.  It also reads the embedcfg file passed on the
command line.  We still don't actually do anything with the
directives.  Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.
Committed to mainline.

Ian
f6b3e2c2f626e9a84a3e37bc60bdb133bbd2a6e0
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 82f43f5f21f..fb4ec30913e 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-22ce16e28220d446c4557f47129024e3561f3d77
+9e78cef2b689aa586dbf677fb47ea3f08f197b91
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/embed.cc b/gcc/go/gofrontend/embed.cc
index 19c6930d0c3..7ee86746212 100644
--- a/gcc/go/gofrontend/embed.cc
+++ b/gcc/go/gofrontend/embed.cc
@@ -626,3 +626,18 @@ Embedcfg_reader::error(const char* msg)
"%<-fgo-embedcfg%>: %s: %s",
this->filename_, msg);
 }
+
+// Return whether the current file imports "embed".
+
+bool
+Gogo::is_embed_imported() const
+{
+  Packages::const_iterator p = this->packages_.find("embed");
+  if (p == this->packages_.end())
+return false;
+
+  // We track current file imports in the package aliases, where a
+  // typical import will just list the package name in aliases.  So
+  // the package has been imported if there is at least one alias.
+  return !p->second->aliases().empty();
+}
diff --git a/gcc/go/gofrontend/go.cc b/gcc/go/gofrontend/go.cc
index e026d6592ba..404cb124549 100644
--- a/gcc/go/gofrontend/go.cc
+++ b/gcc/go/gofrontend/go.cc
@@ -40,6 +40,8 @@ go_create_gogo(const struct go_create_gogo_args* args)
 ::gogo->set_compiling_runtime(args->compiling_runtime);
   if (args->c_header != NULL)
 ::gogo->set_c_header(args->c_header);
+  if (args->embedcfg != NULL)
+::gogo->read_embedcfg(args->embedcfg);
   ::gogo->set_debug_escape_level(args->debug_escape_level);
   if (args->debug_escape_hash != NULL)
 ::gogo->set_debug_escape_hash(args->debug_escape_hash);
diff --git a/gcc/go/gofrontend/gogo.cc b/gcc/go/gofrontend/gogo.cc
index fbf8935bb06..4c795a2b495 100644
--- a/gcc/go/gofrontend/gogo.cc
+++ b/gcc/go/gofrontend/gogo.cc
@@ -7456,8 +7456,8 @@ Variable::Variable(Type* type, Expression* init, bool 
is_global,
   bool is_parameter, bool is_receiver,
   Location location)
   : type_(type), init_(init), preinit_(NULL), location_(location),
-backend_(NULL), is_global_(is_global), is_parameter_(is_parameter),
-is_closure_(false), is_receiver_(is_receiver),
+embeds_(NULL), backend_(NULL), is_global_(is_global),
+is_parameter_(is_parameter), is_closure_(false), is_receiver_(is_receiver),
 is_varargs_parameter_(false), is_global_sink_(false), is_used_(false),
 is_address_taken_(false), is_non_escaping_address_taken_(false),
 seen_(false), init_is_lowered_(false), init_is_flattened_(false),
diff --git a/gcc/go/gofrontend/gogo.h b/gcc/go/gofrontend/gogo.h
index 0d80bdeda4f..891ef697ffe 100644
--- a/gcc/go/gofrontend/gogo.h
+++ b/gcc/go/gofrontend/gogo.h
@@ -397,6 +397,10 @@ class Gogo
   void
   read_embedcfg(const char* filename);
 
+  // Return whether the current file imports "embed".
+  bool
+  is_embed_imported() const;
+
   // Return whether to check for division by zero in binary operations.
   bool
   check_divide_by_zero() const
@@ -2276,6 +2280,16 @@ class Variable
 this->is_referenced_by_inline_ = true;
   }
 
+  // Attach any go:embed comments for this variable.
+  void
+  set_embeds(std::vector* embeds)
+  {
+go_assert(this->is_global_
+ && this->init_ == NULL
+ && this->preinit_ == NULL);
+this->embeds_ = embeds;
+  }
+
   // Return the top-level declaration for this variable.
   Statement*
   toplevel_decl()
@@ -2346,6 +2360,8 @@ class Variable
   Block* preinit_;
   // Location of variable definition.
   Location location_;
+  // Any associated go:embed comments.
+  std::vector* embeds_;
   // Backend representation.
   Bvariable* backend_;
   // Whether this is a global variable.
diff --git a/gcc/go/gofrontend/lex.cc b/gcc/go/gofrontend/lex.cc
index 0baf4e4e24b..dd66c0209a4 100644
--- a/gcc/go/gofrontend/lex.cc
+++ b/gcc/go/gofrontend/lex.cc
@@ -2035,6 +2035,8 @@ Lex::skip_cpp_comment()
  (*this->linknames_)[go_name] = Linkname(ext_name, is_exported, loc);
}
 }
+  else if (verb == "go:embed")
+this->gather_embed(ps, pend);
   else if (verb == "go:nointerface")
 {
   // For field tracking analysis: a //go:nointerface comment means
@@ -2111,6 +2113,98 @@ Lex::skip_cpp_comment()
 }
 }
 
+// Read a go:embed directive.  This is a series of space-separated
+// patterns.  Each pattern may be a quoted or backquoted string.
+
+void
+Lex::gather_embed(const char *p, const char *pend)
+{
+  while (true)
+{
+  // Skip spaces to find the start of the next 

[committed] Minor fix to pr41445-7 testcase

2021-01-18 Thread Jeff Law via Gcc-patches

The addition of the dg-skip-if for AIX changes the line number for the
variable declarations in the file which are checked by the dg-final
directives.  Thus those directives need a trivial update.

Committed to the trunk

Jeff
commit 9a3ab93ceb23fbe45bfbc597d88f208fe092ea14
Author: Jeff Law 
Date:   Mon Jan 18 16:04:11 2021 -0700

[committed] Minor fix to pr41445-7 testcase

gcc/testsuite
* gcc.dg/debug/dwarf2/pr41445-7.c: Fix expected output.

diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-7.c 
b/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-7.c
index d1e8f46e840..1120c6db24d 100644
--- a/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-7.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-7.c
@@ -13,5 +13,5 @@ int A(B) ;
 /*  We want to check that both vari and varj have the same line
 number.  */
 
-/* { dg-final { scan-assembler 
"DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\"vari\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^\\r\\n\]*DW_AT_decl_line
 \\((0xa|10)\\)" } } */
-/* { dg-final { scan-assembler 
"DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\"varj\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^\\r\\n\]*DW_AT_decl_line
 \\((0xa|10)\\)" } } */
+/* { dg-final { scan-assembler 
"DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\"vari\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^\\r\\n\]*DW_AT_decl_line
 \\((0xb|11)\\)" } } */
+/* { dg-final { scan-assembler 
"DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\"varj\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^\\r\\n\]*DW_AT_decl_line
 \\((0xb|11)\\)" } } */


libbacktrace patch committed: Don't fail tests if dwz fails

2021-01-18 Thread Ian Lance Taylor via Gcc-patches
On my system the current version of dwz fails on some DWARF 5 input.
This is reportedly fixed by
https://sourceware.org/pipermail/dwz/2021q1/000775.html, but in the
meantime there is no reason for the libbacktrace testsuite to fail
just because dwz fails.  This patch changes the Makefile so that if
dwz fails, we just use the uncompressed debug info.  The test becomes
meaningless, but at least it passes.  And it will continue to test dwz
information for cases where dwz works.  Bootstrapped and ran
libbacktrace tests on x86_64-pc-linux-gnu.  Committed to mainline.

(This commit also regenerates configure, which was not correctly
regenerated by an earlier commit.  The only difference is some #line
directives.)

Ian

* Makefile.am (%_dwz): If dwz fails, use uncompressed debug info.
* Makefile.in: Regenerate.
* configure: Regenerate.
bfde774667fbce6d7d326c8a36a098138e224a95
diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
index e1e55009f09..8874f41338a 100644
--- a/libbacktrace/Makefile.am
+++ b/libbacktrace/Makefile.am
@@ -303,9 +303,13 @@ if HAVE_DWZ
rm -f $@ $@_common.debug
cp $< $@_1
cp $< $@_2
-   $(DWZ) -m $@_common.debug $@_1 $@_2
-   rm -f $@_2
-   mv $@_1 $@
+   if $(DWZ) -m $@_common.debug $@_1 $@_2; then \
+ rm -f $@_2; \
+ mv $@_1 $@; \
+   else \
+ echo "Ignoring dwz errors, assuming that test passes"; \
+ cp $< $@; \
+   fi
 
 TESTS += btest_dwz
 


libbacktrace patch committed: Use correct DWARF-5 filename index

2021-01-18 Thread Ian Lance Taylor via Gcc-patches
This libbacktrace patch uses the correct directory and filename index
for DWARF 5.  For DWARF 4 and before, the zero entry for the directory
and filename information stored in the line program came from the
compilation unit.  Because of that, the old code used to handle zero
specially, and otherwise subtract one from the index.  For DWARF 5,
the zero entry is actually present in the tables, so it is no longer
appropriate to subtract one.  To make this work in the simplest
manner, just always store the zero entry in the tables, and stop
treating zero specially, and stop subtracting one.  For DWARF 4 and
before, fetch the zero entry from the compilation unit.  Bootstrapped
on x86_64-pc-linux-gnu.  The libbacktrace tests all pass.  The libgo
tests all pass except for the ones that fail due to PR 98708.
Committed to mainline.
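
The indexing scheme described above can be sketched roughly as follows.
This is a simplified illustration, not the libbacktrace code: struct
file_table, file_table_init, file_table_add and file_table_lookup are
hypothetical names, and the table is a fixed-size array rather than memory
obtained from backtrace_alloc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified file-name table.  Entry 0 is always present,
   so an index read from the line program is used unchanged.  */
#define MAX_FILES 16

struct file_table
{
  const char *names[MAX_FILES];
  size_t count;
};

/* Start a table.  For DWARF 4 and earlier the header has no entry 0,
   so the caller passes the compilation unit's file name and it is
   stored in slot 0.  For DWARF 5 the header already supplies entry 0,
   so the caller passes NULL and then adds every header entry,
   including the first, with file_table_add.  */
static void
file_table_init (struct file_table *t, const char *cu_name)
{
  assert (t != NULL);
  t->count = 0;
  if (cu_name != NULL)
    t->names[t->count++] = cu_name;
}

/* Append one entry from the header.  Returns 0 if the table is full.  */
static int
file_table_add (struct file_table *t, const char *name)
{
  if (t->count >= MAX_FILES)
    return 0;
  t->names[t->count++] = name;
  return 1;
}

/* Look up an index with no special case for zero and no subtraction.
   Returns NULL if the index is out of range.  */
static const char *
file_table_lookup (const struct file_table *t, size_t index)
{
  return index < t->count ? t->names[index] : NULL;
}
```

With this layout the lookup is the same for every DWARF version; only the
way slot 0 gets filled differs.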

Ian

* dwarf.c (read_v2_paths): Allocate zero entry for dirs and
filenames.
(read_line_program): Remove parameter u, change caller.  Don't
subtract one from dirs and filenames index.
(read_function_entry): Don't subtract one from filenames index.
4817984f0f79656698e8b380e524f56a53881f15
diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 3d0cbedf770..9097df6cc76 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -2344,19 +2344,20 @@ read_v2_paths (struct backtrace_state *state, struct 
unit *u,
   ++hdr->dirs_count;
 }
 
-  hdr->dirs = NULL;
-  if (hdr->dirs_count != 0)
-{
-  hdr->dirs = ((const char **)
-  backtrace_alloc (state,
-   hdr->dirs_count * sizeof (const char *),
-   hdr_buf->error_callback,
-   hdr_buf->data));
-  if (hdr->dirs == NULL)
-   return 0;
-}
+  /* The index of the first entry in the list of directories is 1.  Index 0 is
+ used for the current directory of the compilation.  To simplify index
+ handling, we set entry 0 to the compilation unit directory.  */
+  ++hdr->dirs_count;
+  hdr->dirs = ((const char **)
+  backtrace_alloc (state,
+   hdr->dirs_count * sizeof (const char *),
+   hdr_buf->error_callback,
+   hdr_buf->data));
+  if (hdr->dirs == NULL)
+return 0;
 
-  i = 0;
+  hdr->dirs[0] = u->comp_dir;
+  i = 1;
   while (*hdr_buf->buf != '\0')
 {
   if (hdr_buf->reported_underflow)
@@ -2383,6 +2384,10 @@ read_v2_paths (struct backtrace_state *state, struct 
unit *u,
   ++hdr->filenames_count;
 }
 
+  /* The index of the first entry in the list of file names is 1.  Index 0 is
+ used for the DW_AT_name of the compilation unit.  To simplify index
+ handling, we set entry 0 to the compilation unit file name.  */
+  ++hdr->filenames_count;
   hdr->filenames = ((const char **)
backtrace_alloc (state,
 hdr->filenames_count * sizeof (char *),
@@ -2390,7 +2395,8 @@ read_v2_paths (struct backtrace_state *state, struct unit 
*u,
 hdr_buf->data));
   if (hdr->filenames == NULL)
 return 0;
-  i = 0;
+  hdr->filenames[0] = u->filename;
+  i = 1;
   while (*hdr_buf->buf != '\0')
 {
   const char *filename;
@@ -2404,7 +2410,7 @@ read_v2_paths (struct backtrace_state *state, struct unit 
*u,
return 0;
   dir_index = read_uleb128 (hdr_buf);
   if (IS_ABSOLUTE_PATH (filename)
- || (dir_index == 0 && u->comp_dir == NULL))
+ || (dir_index < hdr->dirs_count && hdr->dirs[dir_index] == NULL))
hdr->filenames[i] = filename;
   else
{
@@ -2413,10 +2419,8 @@ read_v2_paths (struct backtrace_state *state, struct 
unit *u,
  size_t filename_len;
  char *s;
 
- if (dir_index == 0)
-   dir = u->comp_dir;
- else if (dir_index - 1 < hdr->dirs_count)
-   dir = hdr->dirs[dir_index - 1];
+ if (dir_index < hdr->dirs_count)
+   dir = hdr->dirs[dir_index];
  else
{
  dwarf_buf_error (hdr_buf,
@@ -2704,8 +2708,8 @@ read_line_header (struct backtrace_state *state, struct 
dwarf_data *ddata,
 
 static int
 read_line_program (struct backtrace_state *state, struct dwarf_data *ddata,
-  struct unit *u, const struct line_header *hdr,
-  struct dwarf_buf *line_buf, struct line_vector *vec)
+  const struct line_header *hdr, struct dwarf_buf *line_buf,
+  struct line_vector *vec)
 {
   uint64_t address;
   unsigned int op_index;
@@ -2715,8 +2719,8 @@ read_line_program (struct backtrace_state *state, struct 
dwarf_data *ddata,
 
   address = 0;
   op_index = 0;
-  if (hdr->filenames_count > 0)
-reset_filename = hdr->filenames[0];
+  if (hdr->filenames_count > 1)
+reset_filename = hdr->filenames[1];
   else
 reset_filename = "";
   filename = reset_filename;
@@ -2781,10 +2785,8 @@ read_line_program 

Re: [PATCH] c++: ICE with USING_DECL redeclaration [PR98687]

2021-01-18 Thread Jason Merrill via Gcc-patches

On 1/15/21 12:26 AM, Marek Polacek wrote:

My recent patch that introduced push_using_decl_bindings didn't
handle USING_DECL redeclaration, therefore things broke.  This
patch amends that.  Note that I don't know if the other parts of
finish_nonmember_using_decl are needed (e.g. the binding->type
setting) -- I couldn't trigger it by any of my hand-made testcases.


I'd expect it to be exercised by something along the lines of

struct A { };

void f()
{
  int A;
  using ::A;
  struct A a;
}

Let's factor the code out of finish_nonmember_using_decl rather than 
copy it.



Sorry for not thinking harder about redeclarations in the original
patch :(.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/98687
* name-lookup.c (push_using_decl_bindings): If we found an
existing local binding, update it if it's not identical.

gcc/testsuite/ChangeLog:

PR c++/98687
* g++.dg/lookup/using64.C: New test.
* g++.dg/lookup/using65.C: New test.
---
  gcc/cp/name-lookup.c  | 20 -
  gcc/testsuite/g++.dg/lookup/using64.C | 60 +++
  gcc/testsuite/g++.dg/lookup/using65.C | 17 
  3 files changed, 95 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/lookup/using64.C
  create mode 100644 gcc/testsuite/g++.dg/lookup/using65.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index b4b6c0b81b5..857d90914ca 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -9285,8 +9285,24 @@ push_operator_bindings ()
  void
  push_using_decl_bindings (tree decl)
  {
-  push_local_binding (DECL_NAME (decl), USING_DECL_DECLS (decl),
- /*using*/true);
+  tree name = DECL_NAME (decl);
+  tree value = USING_DECL_DECLS (decl);
+
+  cxx_binding *binding = find_local_binding (current_binding_level, name);
+  if (binding)
+{
+  if (value == binding->value)
+   /* Redeclaration of this USING_DECL.  */;
+  else if (binding->value && TREE_CODE (value) == OVERLOAD)
+   {
+ /* We already have this binding, so replace it.  */
+ update_local_overload (IDENTIFIER_BINDING (name), value);
+ IDENTIFIER_BINDING (name)->value = value;
+   }
+}
+  else
+/* Install the new binding.  */
+push_local_binding (DECL_NAME (decl), value, /*using*/true);
  }
  
  #include "gt-cp-name-lookup.h"

diff --git a/gcc/testsuite/g++.dg/lookup/using64.C 
b/gcc/testsuite/g++.dg/lookup/using64.C
new file mode 100644
index 000..42bce331e19
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/using64.C
@@ -0,0 +1,60 @@
+// PR c++/98687
+// { dg-do compile }
+
+struct S { };
+
+namespace N {
+  template 
+  bool operator==(T, int);
+
+  template 
+  void X(T);
+}
+
+namespace M {
+  template 
+  bool operator==(T, double);
+}
+
+template
+bool fn1 (T t)
+{
+  using N::operator==;
+  return t == 1;
+}
+
+template
+bool fn2 (T t)
+{
+  // Redeclaration.
+  using N::operator==;
+  using N::operator==;
+  return t == 1;
+}
+
+template
+bool fn3 (T t)
+{
+  // Need update_local_overload.
+  using N::operator==;
+  using M::operator==;
+  return t == 1;
+}
+
+template
+void fn4 (T t)
+{
+  struct X { };
+  using N::X;
+  X(1);
+}
+
+void
+g ()
+{
+  S s;
+  fn1 (s);
+  fn2 (s);
+  fn3 (s);
+  fn4 (s);
+}
diff --git a/gcc/testsuite/g++.dg/lookup/using65.C 
b/gcc/testsuite/g++.dg/lookup/using65.C
new file mode 100644
index 000..bc6c086197f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/using65.C
@@ -0,0 +1,17 @@
+// PR c++/98687
+// { dg-do compile }
+
+extern "C" namespace std {
+  double log1p(double);
+}
+namespace std_fallback {
+  template  void log1p();
+}
+template  struct log1p_impl {
+  static int run() {
+using std::log1p;
+using std_fallback::log1p;
+return 0;
+  }
+};
+void log1p() { log1p_impl::run(); }

base-commit: 5fff80fd79c36a1a940b331d20905061d61ee5e6





[PATCH] fwprop: Allow (subreg (mem)) simplifications

2021-01-18 Thread Ilya Leoshkevich via Gcc-patches
Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
and s390x-redhat-linux.  I realize it might be too late for a change
like this, but it's desirable to have this in conjunction with the
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563799.html s390
regression fix, which otherwise produces unnecessary store/load
sequences in certain glibc routines, e.g. __ieee754_sqrtl.  Ok for
master?



Suppose we have:

(set (reg/v:TF 63) (mem/c:TF (reg/v:DI 62)))
(set (reg:FPRX2 66) (subreg:FPRX2 (reg/v:TF 63) 0))

It is clearly profitable to propagate the first insn into the second
one and get:

(set (reg:FPRX2 66) (mem/c:FPRX2 (reg/v:DI 62)))

fwprop actually manages to perform this, but doesn't think the result is
worth it, which results in unnecessary store/load sequences on s390.
Improve the situation by classifying SUBREG -> MEM changes as
profitable.

gcc/ChangeLog:

2021-01-15  Ilya Leoshkevich  

* fwprop.c (fwprop_propagation::classify_result): Allow
(subreg (mem)) simplifications.

gcc/testsuite/ChangeLog:

2021-01-15  Ilya Leoshkevich  

* gcc.target/s390/vector/long-double-to-i64.c: Expect that
float-vector moves do *not* happen.
---
 gcc/fwprop.c  | 5 +
 gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c | 3 +--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index eff8f7cc141..46b8ec7eccf 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -262,6 +262,11 @@ fwprop_propagation::classify_result (rtx old_rtx, rtx 
new_rtx)
   && GET_MODE (new_rtx) == GET_MODE_INNER (GET_MODE (from)))
 return PROFITABLE;
 
+  /* Allow (subreg (mem)) -> (mem) simplifications.  However, do not allow
+ creating new (mem/v)s, since DCE will not remove the old ones.  */
+  if (SUBREG_P (old_rtx) && MEM_P (new_rtx) && !MEM_VOLATILE_P (new_rtx))
+return PROFITABLE;
+
   return 0;
 }
 
diff --git a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c 
b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
index 2dbbb5d1c03..8f4e377ed72 100644
--- a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
+++ b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
@@ -10,8 +10,7 @@ long_double_to_i64 (long double x)
   return x;
 }
 
-/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,1\n} 1 } } */
-/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,5\n} 1 } } */
+/* { dg-final { scan-assembler-not {\n\tvpdi\t} } } */
 /* { dg-final { scan-assembler-times {\n\tcgxbr\t} 1 } } */
 
 int
-- 
2.26.2



[PATCH] IBM Z: Fix usage of "f" constraint with long doubles

2021-01-18 Thread Ilya Leoshkevich via Gcc-patches
Bootstrapped and regtested on s390x-redhat-linux.  Depends on
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/562898.html;
ok for master once the dependency is committed?



After switching the s390 backend to store long doubles in vector
registers, "f" constraint broke when used with the former: long doubles
correspond to TFmode, which in combination with "f" corresponds to
hard regs %v0-%v15, however, asm users expect a %f0-%f15 pair.

Fix by using TARGET_MD_ASM_ADJUST hook to convert TFmode values to
FPRX2mode and back.

gcc/ChangeLog:

2020-12-14  Ilya Leoshkevich  

* config/s390/s390.c (s390_md_asm_adjust): Implement
TARGET_MD_ASM_ADJUST.
(TARGET_MD_ASM_ADJUST): Likewise.
* config/s390/vector.md (fprx2_to_tf): Rename from *fprx2_to_tf,
add memory alternative.
(tf_to_fprx2): New pattern.

gcc/testsuite/ChangeLog:

2020-12-14  Ilya Leoshkevich  

* gcc.target/s390/vector/long-double-asm-abi.c: New test.
* gcc.target/s390/vector/long-double-asm-in-out.c: New test.
* gcc.target/s390/vector/long-double-asm-inout.c: New test.
* gcc.target/s390/vector/long-double-volatile-from-i64.c: New
test.
---
 gcc/config/s390/s390.c| 73 +++
 gcc/config/s390/vector.md | 36 +++--
 .../s390/vector/long-double-asm-abi.c | 26 +++
 .../s390/vector/long-double-asm-in-out.c  | 14 
 .../s390/vector/long-double-asm-inout.c   | 14 
 .../vector/long-double-volatile-from-i64.c| 22 ++
 6 files changed, 180 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/long-double-asm-abi.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-asm-in-out.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/long-double-asm-inout.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-volatile-from-i64.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 9d2cee950d0..a22fd9fe391 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -16688,6 +16688,76 @@ s390_shift_truncation_mask (machine_mode mode)
   return mode == DImode || mode == SImode ? 63 : 0;
 }
 
+/* Implement TARGET_MD_ASM_ADJUST hook in order to fix up "f"
+   constraints when long doubles are stored in vector registers.  */
+
+static rtx_insn *
+s390_md_asm_adjust (vec , vec ,
+   vec _modes,
+   vec , vec & /*clobbers*/,
+   HARD_REG_SET & /*clobbered_regs*/)
+{
+  if (!TARGET_VXE)
+/* Long doubles are stored in FPR pairs - nothing to do.  */
+return NULL;
+
+  rtx_insn *after_md_seq = NULL, *after_md_end = NULL;
+
+  unsigned ninputs = inputs.length ();
+  unsigned noutputs = outputs.length ();
+  for (unsigned i = 0; i < noutputs; i++)
+{
+  if (GET_MODE (outputs[i]) != TFmode)
+   /* Not a long double - nothing to do.  */
+   continue;
+  const char *constraint = constraints[i];
+  bool allows_mem, allows_reg, is_inout;
+  bool ok = parse_output_constraint (, i, ninputs, noutputs,
+_mem, _reg, _inout);
+  gcc_assert (ok);
+  if (strcmp (constraint, "=f") != 0)
+   /* Long double with a constraint other than "=f" - nothing to do.  */
+   continue;
+  gcc_assert (allows_reg);
+  gcc_assert (!allows_mem);
+  gcc_assert (!is_inout);
+  /* Copy output value from a FPR pair into a vector register.  */
+  rtx fprx2 = gen_reg_rtx (FPRX2mode);
+  push_to_sequence2 (after_md_seq, after_md_end);
+  emit_insn (gen_fprx2_to_tf (outputs[i], fprx2));
+  after_md_seq = get_insns ();
+  after_md_end = get_last_insn ();
+  end_sequence ();
+  outputs[i] = fprx2;
+}
+
+  for (unsigned i = 0; i < ninputs; i++)
+{
+  if (GET_MODE (inputs[i]) != TFmode)
+   /* Not a long double - nothing to do.  */
+   continue;
+  const char *constraint = constraints[noutputs + i];
+  bool allows_mem, allows_reg;
+  bool ok = parse_input_constraint (, i, ninputs, noutputs, 0,
+   constraints.address (), _mem,
+   _reg);
+  gcc_assert (ok);
+  if (strcmp (constraint, "f") != 0 && strcmp (constraint, "=f") != 0)
+   /* Long double with a constraint other than "f" (or "=f" for inout
+  operands) - nothing to do.  */
+   continue;
+  gcc_assert (allows_reg);
+  gcc_assert (!allows_mem);
+  /* Copy input value from a vector register into a FPR pair.  */
+  rtx fprx2 = gen_reg_rtx (FPRX2mode);
+  emit_insn (gen_tf_to_fprx2 (fprx2, inputs[i]));
+  inputs[i] = fprx2;
+  input_modes[i] = FPRX2mode;
+}
+
+  return after_md_seq;
+}
+
 /* Initialize GCC target structure.  */
 
 #undef  TARGET_ASM_ALIGNED_HI_OP
@@ -16995,6 +17065,9 @@ s390_shift_truncation_mask (machine_mode mode)
 #undef 

Re: [C PATCH] qualifiers of pointers to arrays in C2X [PR 98397]

2021-01-18 Thread Joseph Myers
On Sun, 17 Jan 2021, Uecker, Martin wrote:

> I did not add tests for c11 for warnings because we already
> had warnings before and the tests for these exist. (I removed 
> -Wdiscarded-array-qualifiers from the old tests as this flag
> is not needed.) Or should there be additional warnings
> with -Wc11-c2x-compat for c11? But warning twice about
> the same issue does not seem ideal...

If something is already warned about by default in C11 mode, 
-Wc11-c2x-compat doesn't need to produce extra warnings.

> + int (*x)[3];
> + const int (*p)[3] = x; /* { dg-warning "pointers to arrays with 
> different qualifiers are
> incompatible in ISO C before C2X"  } */

"incompatible" doesn't seem the right wording for the diagnostic.  The 
types are incompatible (i.e. not compatible types) regardless of standard 
version; the issue in this case is the rules for assignment.

> diff --git a/gcc/testsuite/gcc.dg/c2x-qual-6.c 
> b/gcc/testsuite/gcc.dg/c2x-qual-6.c
> new file mode 100644
> index 000..dca50ac014f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/c2x-qual-6.c
> @@ -0,0 +1,9 @@
> +/* Test related to qualifiers and pointers to arrays in C2X, PR98397 */
> +/* { dg-do compile } */
> +/* { dg-options "-std=c2x -pedantic-errors -Wc11-c2x-compat" } */
> +
> +void f(void)
> +{
> + const void* x;
> + const int (*p)[3] = x; /* { dg-error "array with qualifier on the 
> element is not qualified
> before C2X" } */

This is showing a bug in the compiler implementation.  In C2X mode, this 
message should be a warning not a pedwarn because the code is not a 
constraint violation.  -Wc11-c2x-compat should produce a warning 
(independent of -pedantic), but -pedantic-errors should not turn it into 
an error.

-- 
Joseph S. Myers
jos...@codesourcery.com


[committed] IRA: patch to fix PR97847

2021-01-18 Thread Vladimir Makarov via Gcc-patches

The following patch fixes

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97847

The patch was successfully bootstrapped and tested on x86-64 and ppc64.


[PR97847] IRA: Skip abnormal critical edge splitting

PPC64 can generate jumps with clobbered pseudo-regs, and a BB with such
a jump can have abnormal output edges.  IRA hits an assert when trying
to split an abnormal critical edge in order to deal with asm goto output
reloads later.  The patch just skips splitting abnormal edges.  It is
assumed that an asm goto with output reloads cannot be in a BB with
abnormal output edges.
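
The check that decides whether to skip can be pictured with the toy model
below.  It is only a sketch under simplifying assumptions: struct toy_edge
and the TOY_EDGE_* flags are invented for illustration, and in GCC itself
criticality is computed by EDGE_CRITICAL_P rather than stored as a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for edge properties; the values are arbitrary.  */
enum
{
  TOY_EDGE_ABNORMAL = 1 << 0,
  TOY_EDGE_CRITICAL = 1 << 1,
  TOY_EDGE_TO_EXIT = 1 << 2
};

struct toy_edge
{
  unsigned flags;
};

/* Return true if any successor edge is critical, does not lead to the
   exit block, and is abnormal.  Such an edge cannot be split, so the
   caller should skip edge splitting for the whole block.  */
static bool
has_abnormal_critical_succ (const struct toy_edge *succs, size_t n)
{
  assert (succs != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      unsigned f = succs[i].flags;
      if ((f & TOY_EDGE_CRITICAL)
          && !(f & TOY_EDGE_TO_EXIT)
          && (f & TOY_EDGE_ABNORMAL))
        return true;
    }
  return false;
}
```

The scan runs before any splitting starts, so a block with one bad edge is
left completely untouched instead of being half-processed.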

gcc/ChangeLog:

	PR target/97847
	* ira.c (ira): Skip abnormal critical edge splitting.

diff --git a/gcc/ira.c b/gcc/ira.c
index 725b0ff0276..f0bdbc8cf56 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5433,12 +5433,22 @@ ira (FILE *f)
 	  for (int i = 0; i < recog_data.n_operands; i++)
 	if (recog_data.operand_type[i] != OP_IN)
 	  {
+		bool skip_p = false;
+		FOR_EACH_EDGE (e, ei, bb->succs)
+		  if (EDGE_CRITICAL_P (e)
+		  && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
+		  && (e->flags & EDGE_ABNORMAL))
+		{
+		  skip_p = true;
+		  break;
+		}
+		if (skip_p)
+		  break;
 		output_jump_reload_p = true;
 		FOR_EACH_EDGE (e, ei, bb->succs)
 		  if (EDGE_CRITICAL_P (e)
 		  && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
 		{
-		  ira_assert (!(e->flags & EDGE_ABNORMAL));
 		  start_sequence ();
 		  /* We need to put some no-op insn here.  We can
 			 not put a note as commit_edges insertion will


[committed] c++: Add CTAD + pack expansion testcase

2021-01-18 Thread Patrick Palka via Gcc-patches
After r11-6614 made cp_walk_subtrees walk into the template of a CTAD
placeholder, we now correctly accept the below testcase.  We used to
reject it because find_parameter_packs_r would fail to find the
parameter pack Ts inside the CTAD placeholder within the pack expansion.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction77.C: New test.
---
 gcc/testsuite/g++.dg/cpp1z/class-deduction77.C | 6 ++
 1 file changed, 6 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction77.C

diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction77.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction77.C
new file mode 100644
index 000..c5462fb4f32
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction77.C
@@ -0,0 +1,6 @@
+// { dg-do compile { target c++17 } }
+
+template  struct A {};
+
+template  class... Ts>
+using B = A;
-- 
2.30.0.155.g66e871b664



[r11-6759 Regression] FAIL: gcc.dg/debug/dwarf2/pr41445-7.c scan-assembler DW_TAG_variable[^\\r\\n]*[\\r\\n]+[^\\r\\n]*"varj[^\\r\\n]*DW_AT_name([^\\r\\n]*[\\r\\n]+[^\\r\\n]*DW_AT_)*[^\\r\\n]*[\\r\\n]

2021-01-18 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

b654d23a470af25442e496ba62b5558e7c3ff1e6 is the first bad commit
commit b654d23a470af25442e496ba62b5558e7c3ff1e6
Author: David Edelsohn 
Date:   Sun Jan 17 18:18:56 2021 -0500

testsuite: Skip DWARF 5 testcases on AIX.

caused

FAIL: gcc.dg/debug/dwarf2/pr41445-7.c scan-assembler 
DW_TAG_variable[^\\r\\n]*[\\r\\n]+[^\\r\\n]*"vari[^\\r\\n]*DW_AT_name([^\\r\\n]*[\\r\\n]+[^\\r\\n]*DW_AT_)*[^\\r\\n]*[\\r\\n]+[^\\r\\n]*[^\\r\\n]*DW_AT_decl_line
 \\((0xa|10)\\)
FAIL: gcc.dg/debug/dwarf2/pr41445-7.c scan-assembler 
DW_TAG_variable[^\\r\\n]*[\\r\\n]+[^\\r\\n]*"varj[^\\r\\n]*DW_AT_name([^\\r\\n]*[\\r\\n]+[^\\r\\n]*DW_AT_)*[^\\r\\n]*[\\r\\n]+[^\\r\\n]*[^\\r\\n]*DW_AT_decl_line
 \\((0xa|10)\\)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-6759/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dwarf2.exp=gcc.dg/debug/dwarf2/pr41445-7.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dwarf2.exp=gcc.dg/debug/dwarf2/pr41445-7.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dwarf2.exp=gcc.dg/debug/dwarf2/pr41445-7.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dwarf2.exp=gcc.dg/debug/dwarf2/pr41445-7.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email.  For questions about this report, contact
me at skpgkp2 at gmail dot com.)


Re: [PATCH] Hurd: Enable ifunc by default

2021-01-18 Thread Samuel Thibault via Gcc-patches
Hello,

Joseph Myers, on Mon 18 Jan 2021 20:05:44 +, wrote:
> On Wed, 13 Jan 2021, Thomas Schwinge wrote:
> > Thanks (and sorry for the delay), pushed "Hurd: Enable ifunc by default"
> > to master branch in commit e9cb89b936f831a02318d45fc4ddb06f7be55ae4, and
> > cherry-picked into releases/gcc-10 branch in commit
> > 92b131491c22eb4e4b663d226e9d97f1fd693063, releases/gcc-9 branch in commit
> > 0313ce139f4ca3c96db9dc82125ec9e4a167a224, releases/gcc-8 branch in commit
> > 975b0fa0f43e84bed3cb1b2b593132bc219f962c, see attached.
> 
> I'm not sure what toolchain component the underlying bug is in, but this 
> GCC commit (verified in the releases/gcc-10 case) results in a glibc build 
> failure for i686-gnu with build-many-glibcs.py.
> 
> https://sourceware.org/pipermail/libc-testresults/2021q1/007378.html
> 
> The error is:
> 
> /scratch/jmyers/glibc-bot/install/compilers/i686-gnu/lib/gcc/i686-glibc-gnu/11.0.0/../../../../i686-glibc-gnu/bin/ld:
>  
> /scratch/jmyers/glibc-bot/build/compilers/i686-gnu/glibc/i686-gnu/elf/librtld.os:
>  in function `hurd_file_name_lookup_retry':
> (.text+0x1e08e): undefined reference to `strncpy'

Ah, I believe I had tested that (which is precisely why I asked for that
commit to be done in gcc), but I'll have a look, thanks.

Samuel


Re: [PATCH] libbacktrace: Fix up DWARF5 .debug_line handling [PR98716]

2021-01-18 Thread Jakub Jelinek via Gcc-patches
On Mon, Jan 18, 2021 at 12:11:02PM -0800, Ian Lance Taylor wrote:
> Thanks, but I think more changes are needed.  Looking up entries in
> the filenames array will be off by one for DWARF 5 in other places as
> well, specifically when handling DW_LNS_set_file and DW_AT_call_file.
> I'm testing a larger patch now.

You're right, sorry for missing that spot.

Jakub



Re: [PATCH] libbacktrace: Fix up DWARF5 .debug_line handling [PR98716]

2021-01-18 Thread Ian Lance Taylor via Gcc-patches
On Mon, Jan 18, 2021 at 10:44 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> When GCC (since the switch to -gdwarf-5 by default) is configured and built
> against recent binutils (2.35.1 if slightly patched, or 2.36 or trunk),
> the assembler emits DWARF5 .debug_line rather than DWARF4 or older
> .debug_line.
>
> Seems some DWARF5 support has been added to libbacktrace, but there is one
> problem.  The DWARF5 spec (like DWARF4 spec) says that the initial value of
> file is 1, but unlike DWARF4 and earlier which had in the filename table
> entries starting with 1, DWARF5 has an 0 entry before that (which is
> supposed to match DW_AT_name and DW_AT_comps_dir pair in the .debug_info
> CU).
>
> The following patch fixes that.
>
> On i686-linux when built against those new binutils this fixes (the
> c-c++-common tests for both C and C++):
> -FAIL: c-c++-common/asan/alloca_big_alignment.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/alloca_big_alignment.c   -O2 -flto 
> -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/alloca_detect_custom_size.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/alloca_detect_custom_size.c   -O2 -flto 
> -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/alloca_overflow_partial.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/alloca_overflow_partial.c   -O2 -flto 
> -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/alloca_overflow_right.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/alloca_overflow_right.c   -O2 -flto 
> -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/alloca_underflow_left.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/alloca_underflow_left.c   -O2 -flto 
> -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/global-overflow-1.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/global-overflow-1.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/heap-overflow-1.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/heap-overflow-1.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/null-deref-1.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/null-deref-1.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/sanity-check-pure-c-1.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/sanity-check-pure-c-1.c   -O2 -flto 
> -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/stack-overflow-1.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/stack-overflow-1.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/strncpy-overflow-1.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/strncpy-overflow-1.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  output pattern test
> -FAIL: c-c++-common/asan/use-after-free-1.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  output pattern test
> -FAIL: c-c++-common/asan/use-after-free-1.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  output pattern test
> -FAIL: g++.dg/asan/large-func-test-1.C   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  output pattern test
> -FAIL: g++.dg/asan/large-func-test-1.C   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  output pattern test
> -FAIL: TestCaller
>
> Bootstrapped/regtested on x86_64-linux and i686-linux (2.35.1 binutils but
> not patched) and i686-linux (latest binutils), x86_64-linux regtest still
> pending, ok for trunk?
>
> 2021-01-18  Jakub Jelinek  
>
> PR debug/98716
> * dwarf.c (read_line_program): For DWARF5 .debug_line headers,
> start with hdr->filenames[1] rather than hdr->filenames[0].
>
> --- libbacktrace/dwarf.c.jj 2021-01-04 10:25:53.495067802 +0100
> +++ libbacktrace/dwarf.c2021-01-18 14:27:05.034589998 +0100
> @@ -2715,8 +2715,11 @@ read_line_program (struct backtrace_stat
>
>address = 0;
>op_index = 0;
> -  if (hdr->filenames_count > 0)
> -reset_filename = hdr->filenames[0];
> +  /* The initial file is file with index 1.  In DWARF4 and earlier
> + filename table starts with entry 1, while in DWARF5 it 

Re: [PATCH] Hurd: Enable ifunc by default

2021-01-18 Thread Joseph Myers
On Wed, 13 Jan 2021, Thomas Schwinge wrote:

> Hi!
> 
> Thanks (and sorry for the delay), pushed "Hurd: Enable ifunc by default"
> to master branch in commit e9cb89b936f831a02318d45fc4ddb06f7be55ae4, and
> cherry-picked into releases/gcc-10 branch in commit
> 92b131491c22eb4e4b663d226e9d97f1fd693063, releases/gcc-9 branch in commit
> 0313ce139f4ca3c96db9dc82125ec9e4a167a224, releases/gcc-8 branch in commit
> 975b0fa0f43e84bed3cb1b2b593132bc219f962c, see attached.

I'm not sure what toolchain component the underlying bug is in, but this 
GCC commit (verified in the releases/gcc-10 case) results in a glibc build 
failure for i686-gnu with build-many-glibcs.py.

https://sourceware.org/pipermail/libc-testresults/2021q1/007378.html

The error is:

/scratch/jmyers/glibc-bot/install/compilers/i686-gnu/lib/gcc/i686-glibc-gnu/11.0.0/../../../../i686-glibc-gnu/bin/ld:
 
/scratch/jmyers/glibc-bot/build/compilers/i686-gnu/glibc/i686-gnu/elf/librtld.os:
 in function `hurd_file_name_lookup_retry':
(.text+0x1e08e): undefined reference to `strncpy'
collect2: error: ld returned 1 exit status
Makefile:584: recipe for target 
'/scratch/jmyers/glibc-bot/build/compilers/i686-gnu/glibc/i686-gnu/elf/ld.so' 
failed
make[3]: *** 
[/scratch/jmyers/glibc-bot/build/compilers/i686-gnu/glibc/i686-gnu/elf/ld.so] 
Error 1

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] ipa-sra: Do not remove return values needed because of non-call EH (PR 98690)

2021-01-18 Thread Martin Jambor
Hi, 

IPA-SRA already contains a check to figure out that an otherwise dead
parameter is actually required because of non-call exceptions, but it
is not present at the equivalent spot where SRA figures out whether
the return statement is used for anything useful.  This patch adds
that condition there.

Unfortunately, even though this patch should be good enough for any
normal (I'd even say reasonable) use of the compiler, it hints that
when the user manually switches off all sorts of DCE, IPA-SRA would
probably leave behind problematic statements manipulating what
originally were return values, just like it does for parameters (PR
93385).  Fixing this properly might unfortunately be a separate issue
from the mentioned bug because the LHS of a call is changed during
call redirection and the caller often is not a clone.  But I'll see
what I can do.

Meanwhile, the patch below has been bootstrapped and tested on x86_64.
OK for trunk and then for the gcc-10 branch?

Thanks,

Martin


gcc/ChangeLog:

2021-01-18  Martin Jambor  

PR ipa/98690
* ipa-sra.c (ssa_name_only_returned_p): New parameter fun.  Check
whether non-call exceptions allow removal of a statement.
(isra_analyze_call): Pass the appropriate function to
ssa_name_only_returned_p.

gcc/testsuite/ChangeLog:

2021-01-18  Martin Jambor  

PR ipa/98690
* g++.dg/ipa/pr98690.C: New test.
---
 gcc/ipa-sra.c  | 20 +++-
 gcc/testsuite/g++.dg/ipa/pr98690.C | 27 +++
 2 files changed, 38 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr98690.C

diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index 5d2c0dfce53..1571921cb48 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -1952,13 +1952,13 @@ scan_function (cgraph_node *node, struct function *fun)
 }
 }
 
-/* Return true if SSA_NAME NAME is only used in return statements, or if
-   results of any operations it is involved in are only used in return
-   statements.  ANALYZED is a bitmap that tracks which SSA names we have
-   already started investigating.  */
+/* Return true if SSA_NAME NAME of function described by FUN is only used in
+   return statements, or if results of any operations it is involved in are
+   only used in return statements.  ANALYZED is a bitmap that tracks which SSA
+   names we have already started investigating.  */
 
 static bool
-ssa_name_only_returned_p (tree name, bitmap analyzed)
+ssa_name_only_returned_p (function *fun, tree name, bitmap analyzed)
 {
   bool res = true;
   imm_use_iterator imm_iter;
@@ -1978,8 +1978,9 @@ ssa_name_only_returned_p (tree name, bitmap analyzed)
  break;
}
}
-  else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt))
-  || gimple_code (stmt) == GIMPLE_PHI)
+  else if (!stmt_unremovable_because_of_non_call_eh_p (fun, stmt)
+  && ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt))
+  || gimple_code (stmt) == GIMPLE_PHI))
{
  /* TODO: And perhaps for const function calls too?  */
  tree lhs;
@@ -1995,7 +1996,7 @@ ssa_name_only_returned_p (tree name, bitmap analyzed)
}
  gcc_assert (!gimple_vdef (stmt));
  if (bitmap_set_bit (analyzed, SSA_NAME_VERSION (lhs))
- && !ssa_name_only_returned_p (lhs, analyzed))
+ && !ssa_name_only_returned_p (fun, lhs, analyzed))
{
  res = false;
  break;
@@ -2049,7 +2050,8 @@ isra_analyze_call (cgraph_edge *cs)
   if (TREE_CODE (lhs) == SSA_NAME)
{
  bitmap analyzed = BITMAP_ALLOC (NULL);
- if (ssa_name_only_returned_p (lhs, analyzed))
+ if (ssa_name_only_returned_p (DECL_STRUCT_FUNCTION (cs->caller->decl),
+   lhs, analyzed))
csum->m_return_returned = true;
  BITMAP_FREE (analyzed);
}
diff --git a/gcc/testsuite/g++.dg/ipa/pr98690.C 
b/gcc/testsuite/g++.dg/ipa/pr98690.C
new file mode 100644
index 000..004418e5b40
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr98690.C
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fnon-call-exceptions" } */
+
+int g;
+volatile int v;
+
+static int * __attribute__((noinline))
+almost_useless_return (void)
+{
+  v = 1;
+  return &g;
+}
+
+static void __attribute__((noinline))
+foo (void)
+{
+  int *p = almost_useless_return ();
+  int i = *p;
+  v = 2;
+}
+
+int
+main (int argc, char *argv[])
+{
+  foo ();
+  return 0;
+}
-- 
2.29.2



Re: [PATCH] Modula-2 into the GCC tree on master

2021-01-18 Thread Gaius Mulley via Gcc-patches
Matthias Klose  writes:

> this is missing the definition of lang_register_spec_functions for the jit 
> build.
>
> 2020-03-23  Matthias Klose  
>
> * jit-spec.c (lang_register_spec_functions): New, not used for jit.
>
>
> --- a/gcc/jit/jit-spec.c
> +++ b/gcc/jit/jit-spec.c
> @@ -39,3 +39,9 @@ lang_specific_pre_link (void)
>
>  /* Number of extra output files that lang_specific_pre_link may generate.  */
>  int lang_specific_extra_outfiles = 0;  /* Not used for jit.  */
> +
> +/* lang_register_spec_functions.  Not used for jit.  */
> +void
> +lang_register_spec_functions (void)
> +{
> +}

Hello Matthias,

thanks for spotting this - yes - quite correct,



Re: [PATCH] aarch64: reimplement vqmovn_high* intrinsics using builtins

2021-01-18 Thread Richard Sandiford via Gcc-patches
Kyrylo Tkachov via Gcc-patches  writes:
> diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
> b/gcc/config/aarch64/aarch64-simd-builtins.def
> index 
> 6efc7706a41e02d947753a4cda984159b68bd39f..27e9026d9e8b7ff980c5b8d9ff1b00490e3a18cb
>  100644
> --- a/gcc/config/aarch64/aarch64-simd-builtins.def
> +++ b/gcc/config/aarch64/aarch64-simd-builtins.def
> @@ -175,6 +175,11 @@
>/* Implemented by aarch64_qmovn.  */
>BUILTIN_VSQN_HSDI (UNOP, sqmovn, 0, ALL)
>BUILTIN_VSQN_HSDI (UNOP, uqmovn, 0, ALL)
> +
> +  /* Implemented by aarch64_qxtn2.  */
> +  BUILTIN_VQN (BINOP, sqxtn2, 0, ALL)
> +  BUILTIN_VQN (BINOPU, uqxtn2, 0, ALL)

FWIW, I think these can be NONE instead (i.e. treated as const functions).
The consensus seemed to be that the side-effect on the Q flag isn't
observable.

(Converting the existing intrinsics is still WIP.)

Thanks,
Richard


[PATCH] libbacktrace: Fix up DWARF5 .debug_line handling [PR98716]

2021-01-18 Thread Jakub Jelinek via Gcc-patches
Hi!

When GCC (since the switch to -gdwarf-5 by default) is configured and built
against recent binutils (2.35.1 if slightly patched, or 2.36 or trunk),
the assembler emits DWARF5 .debug_line rather than DWARF4 or older
.debug_line.

Seems some DWARF5 support has been added to libbacktrace, but there is one
problem.  The DWARF5 spec (like DWARF4 spec) says that the initial value of
file is 1, but unlike DWARF4 and earlier which had in the filename table
entries starting with 1, DWARF5 has an 0 entry before that (which is
supposed to match DW_AT_name and DW_AT_comps_dir pair in the .debug_info
CU).

The following patch fixes that.
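
For readers skimming the thread, the index rule described above can be
sketched in standalone C.  This is only an illustration: the names
line_header_sketch and lookup_file_sketch are hypothetical (they are not
libbacktrace's), and the sketch assumes the table is stored the way the
patch implies, i.e. array entry 0 holds file 1 for DWARF 4 and earlier
and file 0 for DWARF 5.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a parsed .debug_line header.  */
struct line_header_sketch
{
  int version;              /* DWARF version of the line program */
  size_t filenames_count;   /* number of entries in FILENAMES */
  const char **filenames;   /* file name table as read from the header */
};

/* Map the DWARF file number FILE to a name, or "" if it is out of range.
   DWARF 5 tables are indexed from 0; earlier versions from 1.  */
static const char *
lookup_file_sketch (const struct line_header_sketch *hdr, size_t file)
{
  size_t idx;

  if (hdr->version >= 5)
    idx = file;
  else
    {
      if (file == 0)
        return "";
      idx = file - 1;
    }
  return idx < hdr->filenames_count ? hdr->filenames[idx] : "";
}
```

With the initial file register value of 1, this picks array entry 0 for
DWARF 4 and array entry 1 for DWARF 5, which is the entry the patch
selects for reset_filename.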

On i686-linux when built against those new binutils this fixes (the
c-c++-common tests for both C and C++):
-FAIL: c-c++-common/asan/alloca_big_alignment.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/alloca_big_alignment.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/alloca_detect_custom_size.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/alloca_detect_custom_size.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/alloca_overflow_partial.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/alloca_overflow_partial.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/alloca_overflow_right.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/alloca_overflow_right.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/alloca_underflow_left.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/alloca_underflow_left.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/global-overflow-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/global-overflow-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/heap-overflow-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/heap-overflow-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/null-deref-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/null-deref-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/sanity-check-pure-c-1.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/sanity-check-pure-c-1.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/stack-overflow-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/stack-overflow-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/strncpy-overflow-1.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/strncpy-overflow-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  output pattern test
-FAIL: c-c++-common/asan/use-after-free-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  output pattern test
-FAIL: c-c++-common/asan/use-after-free-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  output pattern test
-FAIL: g++.dg/asan/large-func-test-1.C   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  output pattern test
-FAIL: g++.dg/asan/large-func-test-1.C   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  output pattern test
-FAIL: TestCaller

Bootstrapped/regtested on x86_64-linux and i686-linux (2.35.1 binutils but
not patched) and i686-linux (latest binutils), x86_64-linux regtest still
pending, ok for trunk?

2021-01-18  Jakub Jelinek  

PR debug/98716
* dwarf.c (read_line_program): For DWARF5 .debug_line headers,
start with hdr->filenames[1] rather than hdr->filenames[0].

--- libbacktrace/dwarf.c.jj 2021-01-04 10:25:53.495067802 +0100
+++ libbacktrace/dwarf.c2021-01-18 14:27:05.034589998 +0100
@@ -2715,8 +2715,11 @@ read_line_program (struct backtrace_stat
 
   address = 0;
   op_index = 0;
-  if (hdr->filenames_count > 0)
-reset_filename = hdr->filenames[0];
+  /* The initial file is file with index 1.  In DWARF4 and earlier
+ filename table starts with entry 1, while in DWARF5 it starts
+ with entry 0 which should match the CU's DW_AT_name attribute.  */
+  if (hdr->filenames_count > (hdr->version >= 5))
+reset_filename = hdr->filenames[hdr->version >= 5];
   else
 reset_filename = "";
   filename = 

[committed] widening_mul: Fix up signed multiplication overflow check handling [PR98727]

2021-01-18 Thread Jakub Jelinek via Gcc-patches
Hi!

I forgot one line, which means that if the second operand of the multiplication
isn't constant, it would be just the same as the first one.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed as obvious to trunk.

2021-01-18  Jakub Jelinek  

PR tree-optimization/98727
* tree-ssa-math-opts.c (match_arith_overflow): Fix up computation of
second .MUL_OVERFLOW operand for signed multiplication with overflow
checking if the second operand of multiplication is not constant.

* gcc.c-torture/execute/pr98727.c: New test.

--- gcc/tree-ssa-math-opts.c.jj 2021-01-12 11:04:39.0 +0100
+++ gcc/tree-ssa-math-opts.c2021-01-18 13:36:38.707210300 +0100
@@ -4170,6 +4170,7 @@ match_arith_overflow (gimple_stmt_iterat
rhs2 = fold_convert (type, rhs2);
   else
{
+ g = SSA_NAME_DEF_STMT (rhs2);
  if (gimple_assign_cast_p (g)
  && useless_type_conversion_p (type,
TREE_TYPE (gimple_assign_rhs1 (g)))
--- gcc/testsuite/gcc.c-torture/execute/pr98727.c.jj2021-01-18 
13:47:00.227192663 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr98727.c   2021-01-18 
13:46:29.511539475 +0100
@@ -0,0 +1,20 @@
+/* PR tree-optimization/98727 */
+
+__attribute__((noipa)) long int
+foo (long int x, long int y)
+{
+  long int z = (unsigned long) x * y;
+  if (x != z / y)
+return -1;
+  return z;
+}
+
+int
+main ()
+{
+  if (foo (4, 24) != 96
+  || foo (124, 126) != 124L * 126
+  || foo (__LONG_MAX__ / 16, 17) != -1)
+__builtin_abort ();
+  return 0;
+}


Jakub



Re: [PATCH] aix: Default to DWARF 4

2021-01-18 Thread David Edelsohn via Gcc-patches
On Mon, Jan 18, 2021 at 11:49 AM Jakub Jelinek  wrote:
>
> On Mon, Jan 18, 2021 at 11:31:37AM -0500, David Edelsohn via Gcc-patches 
> wrote:
> > On Mon, Jan 18, 2021 at 6:01 AM Mark Wielaard  wrote:
> > >
> > > Hi David,
> > >
> > > On Sun, Jan 17, 2021 at 06:12:06PM -0500, David Edelsohn wrote:
> > > > GCC now defaults to DWARF 5.  AIX only supports DWARF 4 (3.5).
> > > >
> > > > This patch overrides the default DWARF version to 4 unless 
> > > > explicitly
> > > > stated.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * config/rs6000/aix71.h (SUBTARGET_OVERRIDE_OPTIONS): 
> > > > Override
> > > > dwarf_version to 4.
> > > > * config/rs6000/aix72.h (SUBTARGET_OVERRIDE_OPTIONS): Same.
> > >
> > > Thanks, I hadn't tested against AIX.  Could you also update
> > > gcc/doc/invoke.texi (-gdwarf) with the defaults for AIX?
> >
> > Other targets override the default DWARF level and I don't see it 
> > documented.
>
> We certainly document that Darwin and VxWorks default to dwarf_version 2.
> Which other targets do you have in mind?

s390 TPF.

- David


[Patch] OpenMP/Fortran: Fixes for {use,is}_device_ptr [PR98476]

2021-01-18 Thread Tobias Burnus

use_device_ptr and is_device_ptr are underspecified for Fortran in OpenMP <= 5.0
and, hence, also not properly implemented in GCC.

OpenMP 5.1 cleaned this up by mapping (for Fortran) use_device_ptr to
the existing and well-defined use_device_addr (except for type(c_ptr)) and
is_device_ptr (with the same exception) to the new has_device_addr.

The attached testcase gave a middle-end (ME) ICE for 'use_device_ptr(cc,dd)',
which was fixed by applying the new OpenMP 5.1 mapping to
'use_device_addr'. → gcc/fortran/openmp.c.

is_device_ptr(aa) is from real-world code (see PR) and could be fixed by
adding two 'pointer_type_p(type)' checks. → omp-low.c

While testing, it turned out that 'is_device_ptr(aa,bb,cc,dd)' was accepted
because only the first list item was checked, which later gave an ICE in the ME.
(Otherwise deferred to OpenMP 5.1's has_device_addr.)
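
The use_device_ptr to use_device_addr remapping amounts to splitting one
singly linked list into two.  The following standalone C sketch shows
that list operation under simplifying assumptions: item_sketch and
split_use_device_ptr_sketch are hypothetical names, and a plain boolean
stands in for the "is type(c_ptr)" test done on the real symbols.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one clause list item.  */
struct item_sketch
{
  const char *name;
  bool is_c_ptr;              /* true for type(c_ptr) variables */
  struct item_sketch *next;
};

/* Move every item that is not type(c_ptr) from *PTR_LIST to the tail of
   *ADDR_LIST, keeping the relative order of items in both lists.  */
static void
split_use_device_ptr_sketch (struct item_sketch **ptr_list,
                             struct item_sketch **addr_list)
{
  /* Find the link that terminates the addr list.  */
  struct item_sketch **addr_tail = addr_list;
  while (*addr_tail)
    addr_tail = &(*addr_tail)->next;

  struct item_sketch **link = ptr_list;
  while (*link)
    {
      struct item_sketch *n = *link;
      if (n->is_c_ptr)
        link = &n->next;        /* stays in the ptr list */
      else
        {
          *link = n->next;      /* unlink from the ptr list */
          n->next = NULL;
          *addr_tail = n;       /* append to the addr list */
          addr_tail = &n->next;
        }
    }
}
```

The sketch uses pointer-to-pointer links instead of separate "previous"
pointers; that is a presentation choice here, not a claim about how the
patch is written.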

OK?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
OpenMP/Fortran: Fixes for {use,is}_device_ptr

gcc/fortran/ChangeLog:

	PR fortran/98476
	* openmp.c (resolve_omp_clauses): Change use_device_ptr
	to use_device_addr unless type(c_ptr); check all
	list items for is_device_ptr.

gcc/ChangeLog:

	PR fortran/98476
	* omp-low.c (lower_omp_target): Handle nonpointer is_device_ptr.

libgomp/ChangeLog:

	PR fortran/98476
	* testsuite/libgomp.fortran/is_device_ptr-1.f90: New test.

gcc/testsuite/ChangeLog:

	PR fortran/98476
	* gfortran.dg/gomp/map-3.f90: Update expected scan-dump-tree.
	* gfortran.dg/gomp/is_device_ptr-1.f90: New test.
	* gfortran.dg/gomp/use_device_ptr-2.f90: New test.

 gcc/fortran/openmp.c   | 67 --
 gcc/omp-low.c  |  6 +-
 gcc/testsuite/gfortran.dg/gomp/is_device_ptr-2.f90 | 21 +++
 gcc/testsuite/gfortran.dg/gomp/map-3.f90   | 10 ++--
 .../gfortran.dg/gomp/use_device_ptr-1.f90  | 25 
 .../testsuite/libgomp.fortran/is_device_ptr-1.f90  | 54 +
 6 files changed, 160 insertions(+), 23 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index a9ecd96cb35..9a3a8f63b5e 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -5345,22 +5345,25 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 		}
 	break;
 	  case OMP_LIST_IS_DEVICE_PTR:
-	if (!n->sym->attr.dummy)
-	  gfc_error ("Non-dummy object %qs in %s clause at %L",
-			 n->sym->name, name, &n->where);
-	if (n->sym->attr.allocatable
-		|| (n->sym->ts.type == BT_CLASS
-		&& CLASS_DATA (n->sym)->attr.allocatable))
-	  gfc_error ("ALLOCATABLE object %qs in %s clause at %L",
-			 n->sym->name, name, &n->where);
-	if (n->sym->attr.pointer
-		|| (n->sym->ts.type == BT_CLASS
-		&& CLASS_DATA (n->sym)->attr.pointer))
-	  gfc_error ("POINTER object %qs in %s clause at %L",
-			 n->sym->name, name, &n->where);
-	if (n->sym->attr.value)
-	  gfc_error ("VALUE object %qs in %s clause at %L",
-			 n->sym->name, name, &n->where);
+	for (n = omp_clauses->lists[list]; n != NULL; n = n->next)
+	  {
+		if (!n->sym->attr.dummy)
+		  gfc_error ("Non-dummy object %qs in %s clause at %L",
+			 n->sym->name, name, &n->where);
+		if (n->sym->attr.allocatable
+		|| (n->sym->ts.type == BT_CLASS
+			&& CLASS_DATA (n->sym)->attr.allocatable))
+		  gfc_error ("ALLOCATABLE object %qs in %s clause at %L",
+			 n->sym->name, name, &n->where);
+		if (n->sym->attr.pointer
+		|| (n->sym->ts.type == BT_CLASS
+			&& CLASS_DATA (n->sym)->attr.pointer))
+		  gfc_error ("POINTER object %qs in %s clause at %L",
+			 n->sym->name, name, &n->where);
+		if (n->sym->attr.value)
+		  gfc_error ("VALUE object %qs in %s clause at %L",
+			 n->sym->name, name, &n->where);
+	  }
 	break;
 	  case OMP_LIST_USE_DEVICE_PTR:
 	  case OMP_LIST_USE_DEVICE_ADDR:
@@ -5657,6 +5660,38 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 	break;
 	  }
   }
+  /* OpenMP 5.1: use_device_ptr acts like use_device_addr, except for
+ type(c_ptr).  */
+  if (omp_clauses->lists[OMP_LIST_USE_DEVICE_PTR])
+{
+  gfc_omp_namelist *n_prev, *n_next, *n_addr;
+  n_addr = omp_clauses->lists[OMP_LIST_USE_DEVICE_ADDR];
+  for (; n_addr && n_addr->next; n_addr = n_addr->next)
+	;
+  n_prev = NULL;
+  n = omp_clauses->lists[OMP_LIST_USE_DEVICE_PTR];
+  while (n)
+	{
+	  n_next = n->next;
+	  if (n->sym->ts.type != BT_DERIVED
+	  || n->sym->ts.u.derived->ts.f90_type != BT_VOID)
+	{
+	  n->next = NULL;
+	  if (n_addr)
+		n_addr->next = n;
+	  else
+		omp_clauses->lists[OMP_LIST_USE_DEVICE_ADDR] = n;
+	  n_addr = n;
+	  if (n_prev)
+		n_prev->next = n_next;
+	  else
+		omp_clauses->lists[OMP_LIST_USE_DEVICE_PTR] = n_next;
+	}
+	  else
+	n_prev = n;
+	  n = n_next;
+	}
+}
   if (omp_clauses->safelen_expr)
 

Re: [PATCH] aix: Default to DWARF 4

2021-01-18 Thread Mark Wielaard
Hi David,

On Mon, 2021-01-18 at 11:31 -0500, David Edelsohn wrote:
> On Mon, Jan 18, 2021 at 6:01 AM Mark Wielaard  wrote:
> > Thanks, I hadn't tested against AIX.  Could you also update
> > gcc/doc/invoke.texi (-gdwarf) with the defaults for AIX?
> 
> Other targets override the default DWARF level and I don't see it documented.

It currently says:

   @item -gdwarf
   @itemx -gdwarf-@var{version}
   @opindex gdwarf
   Produce debugging information in DWARF format (if that is
   supported).
   The value of @var{version} may be either 2, 3, 4 or 5; the default
   version for most targets is 5 (with the exception of VxWorks and
   Darwin/Mac OS X which default to version 2).

I don't know whether AIX really needs a special mention. But I think it
might be good to mention that it defaults to 4 instead of 5.

Cheers,

Mark


Re: [PATCH] aix: Default to DWARF 4

2021-01-18 Thread Jakub Jelinek via Gcc-patches
On Mon, Jan 18, 2021 at 11:31:37AM -0500, David Edelsohn via Gcc-patches wrote:
> On Mon, Jan 18, 2021 at 6:01 AM Mark Wielaard  wrote:
> >
> > Hi David,
> >
> > On Sun, Jan 17, 2021 at 06:12:06PM -0500, David Edelsohn wrote:
> > > GCC now defaults to DWARF 5.  AIX only supports DWARF 4 (3.5).
> > >
> > > This patch overrides the default DWARF version to 4 unless explicitly
> > > stated.
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/rs6000/aix71.h (SUBTARGET_OVERRIDE_OPTIONS): Override
> > > dwarf_version to 4.
> > > * config/rs6000/aix72.h (SUBTARGET_OVERRIDE_OPTIONS): Same.
> >
> > Thanks, I hadn't tested against AIX.  Could you also update
> > gcc/doc/invoke.texi (-gdwarf) with the defaults for AIX?
> 
> Other targets override the default DWARF level and I don't see it documented.

We certainly document that Darwin and VxWorks default to dwarf_version 2.
Which other targets do you have in mind?

Jakub



Re: [PATCH] aix: Default to DWARF 4

2021-01-18 Thread David Edelsohn via Gcc-patches
On Mon, Jan 18, 2021 at 6:01 AM Mark Wielaard  wrote:
>
> Hi David,
>
> On Sun, Jan 17, 2021 at 06:12:06PM -0500, David Edelsohn wrote:
> > GCC now defaults to DWARF 5.  AIX only supports DWARF 4 (3.5).
> >
> > This patch overrides the default DWARF version to 4 unless explicitly
> > stated.
> >
> > gcc/ChangeLog:
> >
> > * config/rs6000/aix71.h (SUBTARGET_OVERRIDE_OPTIONS): Override
> > dwarf_version to 4.
> > * config/rs6000/aix72.h (SUBTARGET_OVERRIDE_OPTIONS): Same.
>
> Thanks, I hadn't tested against AIX.  Could you also update
> gcc/doc/invoke.texi (-gdwarf) with the defaults for AIX?

Other targets override the default DWARF level and I don't see it documented.

Thanks, David
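
For reference, the "override to 4 unless explicitly stated" behaviour
quoted above can be pictured with a small standalone C sketch.  The
struct and function names are hypothetical stand-ins, not GCC's option
machinery; the flag simply models whether the user passed an explicit
-gdwarf-N option.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant option state.  */
struct dwarf_opts_sketch
{
  int dwarf_version;        /* effective DWARF version */
  bool dwarf_version_set;   /* true if requested explicitly by the user */
};

/* Apply a target default of DWARF 4 only when the user has not chosen
   a version explicitly.  */
static void
aix_override_dwarf_sketch (struct dwarf_opts_sketch *opts)
{
  if (!opts->dwarf_version_set)
    opts->dwarf_version = 4;
}
```

Under this model, a compiler-wide default of 5 becomes 4 on the target,
while an explicit request for version 5 (or 2, or 3) is left alone.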


Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init

2021-01-18 Thread Qing Zhao via Gcc-patches



> On Jan 18, 2021, at 7:09 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
> D will keep all initialized aggregates as aggregates and live which
> means stack will be allocated for it.  With A the usual optimizations
> to reduce stack usage can be applied.
 
 I checked the routine “pov::bump_map” in 511.povray_r since it
 has a large stack increase 
 due to implementation D, by examining the IR immediately before the RTL
 expansion phase.  
 (image.cpp.244t.optimized), I found that we have the following
 additional statements for the array elements:
 
 void  pov::bump_map (double * EPoint, struct TNORMAL * Tnormal, double
 * normal)
 {
 …
 double p3[3];
 double p2[3];
 double p1[3];
 float colour3[5];
 float colour2[5];
 float colour1[5];
 …
 # DEBUG BEGIN_STMT
 colour1 = .DEFERRED_INIT (colour1, 2);
 colour2 = .DEFERRED_INIT (colour2, 2);
 colour3 = .DEFERRED_INIT (colour3, 2);
 # DEBUG BEGIN_STMT
 MEM  [(double[3] *)] = p1$0_144(D);
 MEM  [(double[3] *) + 8B] = p1$1_135(D);
 MEM  [(double[3] *) + 16B] = p1$2_138(D);
 p1 = .DEFERRED_INIT (p1, 2);
 # DEBUG D#12 => MEM  [(double[3] *)]
 # DEBUG p1$0 => D#12
 # DEBUG D#11 => MEM  [(double[3] *) + 8B]
 # DEBUG p1$1 => D#11
 # DEBUG D#10 => MEM  [(double[3] *) + 16B]
 # DEBUG p1$2 => D#10
 MEM  [(double[3] *)] = p2$0_109(D);
 MEM  [(double[3] *) + 8B] = p2$1_111(D);
 MEM  [(double[3] *) + 16B] = p2$2_254(D);
 p2 = .DEFERRED_INIT (p2, 2);
 # DEBUG D#9 => MEM  [(double[3] *)]
 # DEBUG p2$0 => D#9
 # DEBUG D#8 => MEM  [(double[3] *) + 8B]
 # DEBUG p2$1 => D#8
 # DEBUG D#7 => MEM  [(double[3] *) + 16B]
 # DEBUG p2$2 => D#7
 MEM  [(double[3] *)] = p3$0_256(D);
 MEM  [(double[3] *) + 8B] = p3$1_258(D);
 MEM  [(double[3] *) + 16B] = p3$2_260(D);
 p3 = .DEFERRED_INIT (p3, 2);
 ….
 }
 
 I guess that the above “MEM ….. = …” are the ones that make the
 differences. Which phase introduced them?
>>> 
>>> Looks like SRA. But you can just dump all and grep for the first 
>>> occurrence. 
>> 
>> Yes, it looks like SRA is the one:
>> 
>> image.cpp.035t.esra:  MEM  [(double[3] *)] = p1$0_195(D);
>> image.cpp.035t.esra:  MEM  [(double[3] *) + 8B] = p1$1_182(D);
>> image.cpp.035t.esra:  MEM  [(double[3] *) + 16B] = p1$2_185(D);
> 
> I realise no-one was suggesting otherwise, but FWIW: SRA could easily
> be extended to handle .DEFERRED_INIT if that's the main source of
> excess stack usage.  A single .DEFERRED_INIT of an aggregate can
> be split into .DEFERRED_INITs of individual components.

Thanks a lot for the suggestion,
I will study the code of SRA to see how to do this and then see whether this 
can resolve the issue.
> 
> In other words, the investigation you're doing looks like the right way
> of deciding which passes are worth extending to handle .DEFERRED_INIT.
Yes, with the study so far, it looks like the major issue with the .DEFERRED_INIT 
approach is the stack size increase.
Hopefully after resolving this issue, we will be done.

Qing

> 
> Thanks,
> Richard



[PATCH 2/n] AVR CC0 conversion - adjust peepholes

2021-01-18 Thread Senthil Kumar Selvaraj via Gcc-patches
Hi,

This patch, to be applied on top of
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563638.html,
adjusts peepholes to match and generate parallels with a clobber of
REG_CC.

It also sets mov_insn as the name of the pattern for the split
insn (rather than the define_insn_and_split), so that
avr_2word_insn_p, which looks for CODE_FOR_mov_insn, works
correctly. This is required for the *cpse.eq peephole to fire, and
also helps generate better code for avr_out_sbxx_branch.

There are no regressions, and the number of test cases reporting
UNSUPPORTED or FAIL because of code size changes (when compared to
mainline) for attiny40 and atmega8 are now down to 3 and 3,
respectively, from 10 and 25 previously.

The embench-iot numbers also show a good improvement.

Benchmark       Baseline  Current  Increase %
--------------  --------  -------  ----------
aha-mont64         6,944    6,944        0
crc32                704      706        0.28
cubic              9,428    9,428        0
edn                3,854    3,854        0
huffbench          2,890    2,890        0
matmult-int        1,164    1,164        0
minver             3,960    3,956       -0.1
nbody              3,106    3,110        0.13
nettle-aes         5,292    5,304        0.23
nettle-sha256     25,748   25,748        0
nsichneu          39,622   39,622        0
picojpeg           9,898    9,980        0.83
qrduino            9,234    9,356        1.32
sglib-combined     4,658    4,658        0
slre               4,000    4,000        0
st                 3,356    3,356        0
statemate          5,490    5,502        0.22
ud                 2,940    2,940        0
wikisort          20,776   20,772       -0.02

Regards
Senthil


gcc/ChangeLog:

* config/avr/avr.md: Adjust peepholes to match and
generate parallels with clobber of REG_CC.
(mov_insn): Rename to mov_insn_split.
(*mov_insn): Rename to mov_insn.


diff --git gcc/config/avr/avr.md gcc/config/avr/avr.md
index 2206fa19671..a1a325b7a8c 100644
--- gcc/config/avr/avr.md
+++ gcc/config/avr/avr.md
@@ -724,9 +724,7 @@ (define_expand "mov"
 ;; are call-saved registers, and most of LD_REGS are call-used registers,
 ;; so this may still be a win for registers live across function calls.
 
-;; "movqi_insn"
-;; "movqq_insn" "movuqq_insn"
-(define_insn_and_split "mov_insn"
+(define_insn_and_split "mov_insn_split"
   [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r,d,Qm   ,r 
,q,r,*r")
 (match_operand:ALL1 1 "nox_general_operand"   "r Y00,n Ynn,r 
Y00,Qm,r,q,i"))]
   "register_operand (operands[0], mode)
@@ -737,7 +735,9 @@ (define_insn_and_split "mov_insn"
(match_dup 1))
   (clobber (reg:CC REG_CC))])])
 
-(define_insn "*mov_insn"
+;; "movqi_insn"
+;; "movqq_insn" "movuqq_insn"
+(define_insn "mov_insn"
   [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r,d,Qm   ,r 
,q,r,*r")
 (match_operand:ALL1 1 "nox_general_operand"   "r Y00,n Ynn,r 
Y00,Qm,r,q,i"))
(clobber (reg:CC REG_CC))]
@@ -758,7 +758,8 @@ (define_insn "*mov_insn"
 (define_insn "*reload_in"
   [(set (match_operand:ALL1 0 "register_operand""=l")
 (match_operand:ALL1 1 "const_operand""i"))
-   (clobber (match_operand:QI 2 "register_operand" "="))]
+   (clobber (match_operand:QI 2 "register_operand" "="))
+   (clobber (reg:CC REG_CC))]
   "reload_completed"
   "ldi %2,lo8(%1)
mov %0,%2"
@@ -766,15 +767,17 @@ (define_insn "*reload_in"
 
 (define_peephole2
   [(match_scratch:QI 2 "d")
-   (set (match_operand:ALL1 0 "l_register_operand" "")
-(match_operand:ALL1 1 "const_operand" ""))]
+   (parallel [(set (match_operand:ALL1 0 "l_register_operand" "")
+   (match_operand:ALL1 1 "const_operand" ""))
+  (clobber (reg:CC REG_CC))])]
   ; No need for a clobber reg for 0x0, 0x01 or 0xff
   "!satisfies_constraint_Y00 (operands[1])
&& !satisfies_constraint_Y01 (operands[1])
&& !satisfies_constraint_Ym1 (operands[1])"
   [(parallel [(set (match_dup 0)
(match_dup 1))
-  (clobber (match_dup 2))])])
+  (clobber (match_dup 2))
+  (clobber (reg:CC REG_CC))])])
 
 ;;
 ;; move word (16 bit)
@@ -804,12 +807,14 @@ (define_insn "movhi_sp_r"
 
 (define_peephole2
   [(match_scratch:QI 2 "d")
-   (set (match_operand:ALL2 0 "l_register_operand" "")
-(match_operand:ALL2 1 "const_or_immediate_operand" ""))]
+   (parallel [(set (match_operand:ALL2 0 "l_register_operand" "")
+   (match_operand:ALL2 1 "const_or_immediate_operand" ""))
+  (clobber (reg:CC REG_CC))])]
   "operands[1] != CONST0_RTX (mode)"
   [(parallel [(set (match_dup 0)
(match_dup 1))
-  (clobber (match_dup 2))])])
+  (clobber (match_dup 2))
+  (clobber (reg:CC REG_CC))])])
 
 ;; '*' because it is not used in 

[committed] Avoid no-stack-protector-attr fails on hppa*-*-*

2021-01-18 Thread John David Anglin
The stack grows up on hppa and stack protection is not supported.

Committed to master.

Regards,
Dave

Avoid no-stack-protector-attr fails on hppa*-*-*.

gcc/testsuite/ChangeLog:

* g++.dg/no-stack-protector-attr-3.C: Don't compile on hppa*-*-*.
* g++.dg/no-stack-protector-attr.C: Likewise.

diff --git a/gcc/testsuite/g++.dg/no-stack-protector-attr-3.C 
b/gcc/testsuite/g++.dg/no-stack-protector-attr-3.C
index dd9cd4991b6..56a4e74da50 100644
--- a/gcc/testsuite/g++.dg/no-stack-protector-attr-3.C
+++ b/gcc/testsuite/g++.dg/no-stack-protector-attr-3.C
@@ -4,7 +4,7 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
 /* { dg-options "-O2 -fstack-protector-explicit" } */

-/* { dg-do compile } */
+/* { dg-do compile { target { ! hppa*-*-* } } } */

 int __attribute__((no_stack_protector)) foo()
 {
diff --git a/gcc/testsuite/g++.dg/no-stack-protector-attr.C 
b/gcc/testsuite/g++.dg/no-stack-protector-attr.C
index e5105bf9478..3314a94bd7b 100644
--- a/gcc/testsuite/g++.dg/no-stack-protector-attr.C
+++ b/gcc/testsuite/g++.dg/no-stack-protector-attr.C
@@ -4,7 +4,7 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
 /* { dg-options "-O2 -fstack-protector-all" } */

-/* { dg-do compile } */
+/* { dg-do compile { target { ! hppa*-*-* } } } */

 int __attribute__((no_stack_protector)) c()
 {


[committed] Skip asm goto test fails on hppa

2021-01-18 Thread John David Anglin
The hppa target is a reload target and asm goto is not supported on reload 
targets.
Skip failing tests on hppa.

Committed to master.

Regards,
Dave

Skip asm goto tests on hppa*-*-*.

gcc/testsuite/ChangeLog:

PR testsuite/97987
* gcc.c-torture/compile/asmgoto-2.c: Skip on hppa.
* gcc.c-torture/compile/asmgoto-5.c: Likewise.

diff --git a/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c 
b/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c
index f1b30c02884..d2d2ac536bd 100644
--- a/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c
+++ b/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c
@@ -1,5 +1,6 @@
 /* This test should be switched off for a new target with less than 4 
allocatable registers */
 /* { dg-do compile } */
+/* { dg-skip-if "Reload target" { hppa*-*-* } } */
 int
 foo (void)
 {
diff --git a/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c 
b/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c
index 94c14dd4005..ce751ced90c 100644
--- a/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c
+++ b/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c
@@ -1,6 +1,7 @@
 /* Test to generate output reload in asm goto on x86_64.  */
 /* { dg-do compile } */
 /* { dg-skip-if "no O0" { { i?86-*-* x86_64-*-* } && { ! ia32 } } { "-O0" } { 
"" } } */
+/* { dg-skip-if "Reload target" { hppa*-*-* } } */

 #if defined __x86_64__
 #define ASM(s) asm (s)


[PATCH] aarch64: Use GCC vector extensions for integer mls intrinsics

2021-01-18 Thread Jonathan Wright via Gcc-patches
Hi,

As subject, this patch rewrites integer mls Neon intrinsics to use
a - b * c rather than inline assembly code, allowing for better
scheduling and optimization.
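
For readers of the archive, here is a minimal sketch of the idea, assuming only the generic GCC/Clang `vector_size` attribute rather than the real arm_neon.h types. The names `s8x8`, `u8x8` and `mls_s8` are hypothetical stand-ins, not the identifiers used in the patch. The signed variant does its arithmetic on unsigned elements, mirroring the casts in the diff, so that wraparound is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the arm_neon.h vector types, declared with
   the generic vector_size attribute so the sketch builds on any target
   that supports GCC vector extensions.  */
typedef int8_t  s8x8 __attribute__ ((vector_size (8)));
typedef uint8_t u8x8 __attribute__ ((vector_size (8)));

static_assert (sizeof (s8x8) == 8, "expected a 64-bit vector");

/* Element-wise multiply-subtract: a - b * c.  The computation is done
   on unsigned elements so that overflow wraps modulo 256; the result
   bits are then reinterpreted as signed elements.  */
static inline s8x8
mls_s8 (s8x8 a, s8x8 b, s8x8 c)
{
  u8x8 result = (u8x8) a - (u8x8) b * (u8x8) c;
  return (s8x8) result;
}
```

Because the body is ordinary arithmetic rather than an opaque asm statement, the compiler is free to schedule it, fold constants through it, or combine it with neighbouring operations.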

Regression tested and bootstrapped on aarch64-none-linux-gnu - no
issues.

If ok, please commit to master (I don't have commit rights.)

Thanks,
Jonathan

---

gcc/ChangeLog:

2021-01-14  Jonathan Wright  

* config/aarch64/arm_neon.h (vmls_s8): Use C rather than asm.
(vmls_s16): Likewise.
(vmls_s32): Likewise.
(vmls_u8): Likewise.
(vmls_u16): Likewise.
(vmls_u32): Likewise.
(vmlsq_s8): Likewise.
(vmlsq_s16): Likewise.
(vmlsq_s32): Likewise.
(vmlsq_u8): Likewise.
(vmlsq_u16): Likewise.
(vmlsq_u32): Likewise.

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 608e582d25820062a409310e7f3fc872660f8041..ad04eab1e753aa86f20a8f6cc2717368b1840ef7 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -7968,72 +7968,45 @@ __extension__ extern __inline int8x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmls_s8 (int8x8_t __a, int8x8_t __b, int8x8_t __c)
 {
-  int8x8_t __result;
-  __asm__ ("mls %0.8b,%2.8b,%3.8b"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  uint8x8_t __result = (uint8x8_t) __a - (uint8x8_t) __b * (uint8x8_t) __c;
+  return (int8x8_t) __result;
 }
 
 __extension__ extern __inline int16x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmls_s16 (int16x4_t __a, int16x4_t __b, int16x4_t __c)
 {
-  int16x4_t __result;
-  __asm__ ("mls %0.4h,%2.4h,%3.4h"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  uint16x4_t __result = (uint16x4_t) __a - (uint16x4_t) __b * (uint16x4_t) __c;
+  return (int16x4_t) __result;
 }
 
 __extension__ extern __inline int32x2_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmls_s32 (int32x2_t __a, int32x2_t __b, int32x2_t __c)
 {
-  int32x2_t __result;
-  __asm__ ("mls %0.2s,%2.2s,%3.2s"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  uint32x2_t __result = (uint32x2_t) __a - (uint32x2_t) __b * (uint32x2_t) __c;
+  return (int32x2_t) __result;
 }
 
 __extension__ extern __inline uint8x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmls_u8 (uint8x8_t __a, uint8x8_t __b, uint8x8_t __c)
 {
-  uint8x8_t __result;
-  __asm__ ("mls %0.8b,%2.8b,%3.8b"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __a - __b * __c;
 }
 
 __extension__ extern __inline uint16x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmls_u16 (uint16x4_t __a, uint16x4_t __b, uint16x4_t __c)
 {
-  uint16x4_t __result;
-  __asm__ ("mls %0.4h,%2.4h,%3.4h"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __a - __b * __c;
 }
 
 __extension__ extern __inline uint32x2_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmls_u32 (uint32x2_t __a, uint32x2_t __b, uint32x2_t __c)
 {
-  uint32x2_t __result;
-  __asm__ ("mls %0.2s,%2.2s,%3.2s"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __a - __b * __c;
 }
 
 #define vmlsl_high_lane_s16(a, b, c, d) \
@@ -8565,72 +8538,45 @@ __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlsq_s8 (int8x16_t __a, int8x16_t __b, int8x16_t __c)
 {
-  int8x16_t __result;
-  __asm__ ("mls %0.16b,%2.16b,%3.16b"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  uint8x16_t __result = (uint8x16_t) __a - (uint8x16_t) __b * (uint8x16_t) __c;
+  return (int8x16_t) __result;
 }
 
 __extension__ extern __inline int16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlsq_s16 (int16x8_t __a, int16x8_t __b, int16x8_t __c)
 {
-  int16x8_t __result;
-  __asm__ ("mls %0.8h,%2.8h,%3.8h"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  uint16x8_t __result = (uint16x8_t) __a - (uint16x8_t) __b * (uint16x8_t) __c;
+  return (int16x8_t) __result;
 }
 
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlsq_s32 (int32x4_t __a, int32x4_t __b, int32x4_t __c)
 {
-  int32x4_t __result;
-  __asm__ ("mls %0.4s,%2.4s,%3.4s"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  

Re: [backport gcc10, gcc9] Request to backport PR97969

2021-01-18 Thread Vladimir Makarov via Gcc-patches



On 2021-01-18 7:50 a.m., Richard Biener wrote:

On Mon, 18 Jan 2021, Przemyslaw Wirkus wrote:


Hi all,

Can we backport PR97969 patch to GCC 10 and (maybe) GCC 9 ?:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97969

IMHO bug is severe and could land in GCC 10 and 9. Vladimir's original patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563322.html
applies without changes to both gcc-10 and gcc-9.

I've regression tested this patch on both gcc-10 and gcc-9 branches for
x86_64 cross (arm-eabi target) and no issues.

OK for gcc-10 and gcc-9 ?

I see two fallout PRs with a trivial search: PR98643 and PR98722.  LRA
patches quite easily trigger unexpected fallout unfortunately ...

Yes, I agree.  We should wait until the new regressions are fixed.  I 
am going to work on this patch more to fix the new regressions, 
although the basic idea of the original solution will probably 
stay the same.

PS: I can commit if approved.





Re: [PATCH] PING implement pre-c++20 contracts

2021-01-18 Thread Jason Merrill via Gcc-patches

On 1/4/21 9:58 AM, Jeff Chapman wrote:
Ping. re: 
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561135.html 



 > OK, I'll start with -alt then, thanks.

Andrew is exactly correct, contracts-jac-alt is still the current
branch we're focusing our upstreaming efforts on.

It's trailing upstream master by a fair bit at this point. I'll get
a merge pushed shortly.


The latest is still on the same branch, which hasn't been updated since 
that last merge:
https://github.com/lock3/gcc/tree/contracts-jac-alt 



Would you prefer me to keep it from trailing upstream too much through 
regular merges, or would it be more beneficial for it to be left alone 
so you have a more stable review target?


Please let me know if there's initial feedback I can start addressing, 
or anything I can do to help the review process along in general.


Why is some of the code in c-family?  From the modules merge there is 
now a cp_handle_option function that you could add the option handling 
to, and I don't see a reason for cxx-contracts.c to be in c-family/ 
rather than cp/.


And then much of the code you add to decl.c could also move to the 
contracts file, and some of the contracts stuff in cp-tree.h could move 
to cxx-contracts.h?



+extern bool cxx23_contract_attribute_p (const_tree);


This name seems optimistic.  :)
Let's call it cxx_contract_attribute_p.


+/* Return TRUE iff ATTR has been parsed by the fornt-end as a c++2a contract


"front"


@@ -566,7 +566,11 @@ decl_attributes (tree *node, tree attributes, int flags,
{
  if (!(flags & (int) ATTR_FLAG_BUILT_IN))
{
- if (ns == NULL_TREE || !cxx11_attr_p)
+ if (cxx23_contract_attribute_p (attr))
+   {
+ ; /* Do not warn about contract "attributes".  */
+   }


I don't want the language-independent code to have to know about this. 
If you want decl_attributes to ignore these attributes, you could give 
these attributes a dummy spec that just returns?



+set_decl_contracts (tree decl, tree contract_attrs)
+{
+  remove_contract_attributes (decl);
+  if (!DECL_ATTRIBUTES (decl))
+{
+  DECL_ATTRIBUTES (decl) = contract_attrs;
+  return;
+}
+  tree last_attr = DECL_ATTRIBUTES (decl);
+  while (TREE_CHAIN (last_attr))
+last_attr = TREE_CHAIN (last_attr);
+  TREE_CHAIN (last_attr) = contract_attrs;


I think you want to use 'chainon' here.
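
As an illustration of what that suggestion buys, here is a minimal sketch of the list-append operation, written against a hypothetical `struct node` rather than GCC's real tree type. The helper name `chain_on` is also made up for this sketch; it only mirrors the documented behaviour of `chainon`, which concatenates two TREE_CHAIN-linked lists and copes with an empty first list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node; only the chain link matters.  */
struct node
{
  int value;
  struct node *chain;
};

/* Append list OP2 to the end of list OP1 and return the head of the
   combined list.  Either list may be empty (NULL).  */
static struct node *
chain_on (struct node *op1, struct node *op2)
{
  if (op1 == NULL)
    return op2;
  if (op2 == NULL)
    return op1;

  /* Appending a list to itself would create a cycle.  */
  assert (op1 != op2);

  struct node *last = op1;
  while (last->chain != NULL)
    last = last->chain;
  last->chain = op2;
  return op1;
}
```

With a helper of this shape, the empty-list special case and the walk to the last element in the quoted hunk collapse into a single assignment of the helper's result back to the attribute list.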


@@ -5498,10 +5863,17 @@ start_decl (const cp_declarator *declarator,
 
   if (DECL_EXTERNAL (decl) && ! DECL_TEMPLATE_SPECIALIZATION (decl)

  /* Aliases are definitions. */
- && !alias)
+ && !alias
+ && (DECL_VIRTUAL_P (decl) || !flag_contracts))
permerror (declarator->id_loc,
   "declaration of %q#D outside of class is not definition",
   decl);
+  else if (DECL_EXTERNAL (decl) && ! DECL_TEMPLATE_SPECIALIZATION (decl)
+ /* Aliases are definitions. */
+ && !alias
+ && flag_contract_strict_declarations)
+   warning_at (declarator->id_loc, OPT_fcontract_strict_declarations_,
+   "non-defining declaration of %q#D outside of class", decl);


Let's keep the same message for the two cases.


+void
+finish_function_contracts (tree fndecl, bool is_inline)


This function needs a comment.


+/* cp_tree_defined_p helper -- returns TP if TP is undefined.  */
+
+static tree
+cp_tree_defined_p_r (tree *tp, int *, void *)
+{
+  enum tree_code code = TREE_CODE (*tp);
+  if ((code == FUNCTION_DECL || code == VAR_DECL)
+  && !decl_defined_p (*tp))
+return *tp;
+  /* We never want to accidentally instantiate templates.  */
+  if (code == TEMPLATE_DECL)
+return *tp; /* FIXME? */


In what context are you getting a TEMPLATE_DECL here?  I don't see how 
this would have an effect on instantiations.



+/* Parse a conditional-expression.  */
+/* FIXME: should callers use cp_parser_constant_expression?  */
+
+static cp_expr
+cp_parser_conditional_expression (cp_parser *parser)

...

+  /* FIXME: can we use constant_expression for this?  */
+  cp_expr cond = cp_parser_conditional_expression (parser);


I don't think we want to use cp_parser_constant_expression for 
expressions that are not intended to be constant.



+  bool finishing_guarded_p = true//!processing_template_decl


?


+/* FIXME: Is this going to leak?  */
+comment_str = xstrdup (expr_to_string (cond));


There's no need to strdup here (and free a few lines later); 
build_string_literal copies the bytes.  The return value of 
expr_to_string is in GC memory.



+  /* If we have contracts, check that they're valid in this context.  */
+  // FIXME: These aren't entirely correct.


How not?  Can local extern function decls have contract attributes?


+  if (tree pre = lookup_attribute ("pre", 

Re: [PATCH] analyzer: use "malloc" attribute

2021-01-18 Thread David Malcolm via Gcc-patches
On Mon, 2021-01-18 at 14:26 +0100, Richard Biener wrote:
> On Sun, 17 Jan 2021, David Malcolm wrote:
> 
> > This is an updated version of this patch from October:
> > 
> >   'RFC: add "deallocated_by" attribute for use by analyzer'
> > 
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/44.html
> > 
> > reworking it to build on top of Martin's work as noted below,
> > reusing
> > the existing attribute rather than adding a new one.
> > 
> > Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
> > Also tested by hand with valgrind.
> > 
> > Apart from a trivial change to attrib.h and to builtins.c, this
> > is confined to the analyzer code and docs.  The original patch was
> > posted in stage 1.  Is this OK for master?  (I'm hoping for
> > release manager permission to commit this code now; it's not
> > clear to me whether the end of stage 3 was on the 16th or is
> > today on the 17th).
> 
> I guess it's still OK if you're quick.
> 
> Richard.

Thanks; pushed as c7e276b869bdeb4a95735c1f037ee1a5f629de3d.




[committed] libstdc++: Only test writing to wostream if supported [PR 98725]

2021-01-18 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

PR libstdc++/98725
* testsuite/20_util/unique_ptr/io/lwg2948.cc:  Do not try to
write to a wide character stream if wide character support is
disabled in the library.

Tested powerpc64le-linux. Committed to trunk.

commit ec153f96f8943f1d2418d2248ed219358990bb5f
Author: Jonathan Wakely 
Date:   Mon Jan 18 14:23:13 2021

libstdc++: Only test writing to wostream if supported [PR 98725]

libstdc++-v3/ChangeLog:

PR libstdc++/98725
* testsuite/20_util/unique_ptr/io/lwg2948.cc:  Do not try to
write to a wide character stream if wide character support is
disabled in the library.

diff --git a/libstdc++-v3/testsuite/20_util/unique_ptr/io/lwg2948.cc 
b/libstdc++-v3/testsuite/20_util/unique_ptr/io/lwg2948.cc
index ab0b17d2b1c..131bfb24ed7 100644
--- a/libstdc++-v3/testsuite/20_util/unique_ptr/io/lwg2948.cc
+++ b/libstdc++-v3/testsuite/20_util/unique_ptr/io/lwg2948.cc
@@ -73,8 +73,10 @@ template
 
 static_assert( streamable>> );
 static_assert( ! streamable>> );
+#ifdef _GLIBCXX_USE_WCHAR_T
 static_assert( ! streamable>> );
 static_assert( streamable>> );
+#endif
 
 void
 test02()


[PATCH] testsuite/97494 - adjust gcc.dg/vect/slp-11b.c

2021-01-18 Thread Richard Biener
Support for loop SLP splitting exposed that slp-11b.c has
folding that breaks SLP discovery which isn't what was intended
when the testcase was written.  The following makes it SLP-able
and "only" run into the issue that a load permutation is required.

And tries to adjust the target selectors accordingly.

Pushed.  (fingers crossed)

2021-01-18  Richard Biener  

PR testsuite/97494
* gcc.dg/vect/slp-11b.c: Adjust.
---
 gcc/testsuite/gcc.dg/vect/slp-11b.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/slp-11b.c 
b/gcc/testsuite/gcc.dg/vect/slp-11b.c
index 0aece8092a8..c4d9ab0f36b 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-11b.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-11b.c
@@ -12,13 +12,13 @@ main1 ()
   unsigned int out[N*8], a0, a1, a2, a3, a4, a5, a6, a7, b1, b0, b2, b3, b4, 
b5, b6, b7;
   unsigned int in[N*8] = 
{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
 
-  /* Requires permutation - not SLPable.  */
+  /* Requires permutation for SLP.  */
   for (i = 0; i < N*2; i++)
 {
   out[i*4] = (in[i*4] + 2) * 3;
   out[i*4 + 1] = (in[i*4 + 2] + 2) * 7;
   out[i*4 + 2] = (in[i*4 + 1] + 7) * 3;
-  out[i*4 + 3] = (in[i*4 + 3] + 3) * 4;
+  out[i*4 + 3] = (in[i*4 + 3] + 3) * 7;
 }
 
   /* check results:  */
@@ -27,7 +27,7 @@ main1 ()
   if (out[i*4] !=  (in[i*4] + 2) * 3
  || out[i*4 + 1] != (in[i*4 + 2] + 2) * 7
  || out[i*4 + 2] != (in[i*4 + 1] + 7) * 3
- || out[i*4 + 3] != (in[i*4 + 3] + 3) * 4)
+ || out[i*4 + 3] != (in[i*4 + 3] + 3) * 7)
 abort ();
 }
 
@@ -43,7 +43,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { 
vect_strided4 && vect_int_mult } } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { 
! { vect_strided4 && vect_int_mult } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { 
target { ! vect_load_lanes } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
target { vect_load_lanes } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { 
{ vect_strided4 || vect_perm } && vect_int_mult } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
target { vect_perm && vect_int_mult } } } } */
-- 
2.26.2


Re: [PATCH] alias: Fix offset checks involving section anchors [PR92294]

2021-01-18 Thread Richard Sandiford via Gcc-patches
Jan Hubicka  writes:
>> >> 
>> >> Well, in tree-ssa code we do assume these to be either disjoint objects
>> >> or equal (in decl_refs_may_alias_p that continues in case
>> >> compare_base_decls is -1).  I am not sure if we win much by treating
>> >> them differently on RTL level. I would prefer staying consistent here.
>> 
>> Yeah, I see your point.  My concern here was that the fallback case
>> applies to SYMBOL_REFs without decls, which might not have been visible
>> at the tree-ssa level.  E.g. they might be ABI-defined symbols that have
>> no known relation to source-level constructs.
>> 
>> E.g. the small-data base symbol _gp on MIPS points at a fixed offset
>> from the start of the small-data area (0x7ff0 IIRC).  If the target
>> generated rtl code that used _gp directly, we could wrongly assume
>> that _gp+X can't alias BASE+Y when X != Y, even though the real test
>> for small-data BASEs would be whether X + 0x7ff0 != Y.
>> 
>> I don't think that could occur in tree-ssa.  No valid C code would
>> be able to refer directly to _gp in this way.
>> 
>> On the other hand, I don't have a specific example of where this does
>> go wrong, it's just a feeling that it might.  I can drop it if you
>> think that's better.
>
> I would lean towards not disabling optimization when we have no good
> reason for that - we already did it a bit too many times in aliasing code
> and it is hard to figure out which optimizations are missed purposefully
> and which are missed just by omission.
>
> We already committed to a very conservative assumption that every
> external symbol can be an alias of another. I think we should have
> originally required units that refer to the same memory location via
> different symbols to declare it explicitly (i.e. make external alias to
> external symbol), but we do not even allow external aliases (symtab
> supports that though) and also it may depend on use of the module what
> symbols are aliased.
>
> We also decided to disable TBAA for direct accesses to decls to allow
> type punning using unions.
>
> This leaves the offset+range check as the only means of disambiguation.
> While for modern programs global arrays are not common, for Fortran
> stuff they are, so I would prefer to not cripple them even more.
> (I am not sure how often the arrays are external though)

OK, the version below drops the new -2 return value and tries to
clarify the comments in compare_base_symbol_refs.

Lightly tested on aarch64-linux-gnu so far.  Does it look OK if
full tests pass?

Thanks,
Richard



memrefs_conflict_p assumes that:

  [XB + XO, XB + XO + XS)

does not alias

  [YB + YO, YB + YO + YS)

whenever:

  [XO, XO + XS)

does not intersect

  [YO, YO + YS)

In other words, the accesses can alias only if XB == YB at runtime.

However, this doesn't cope correctly with section anchors.
For example, if XB is an anchor symbol and YB is at offset
XO from the anchor, then:

  [XB + XO, XB + XO + XS)

overlaps

  [YB, YB + YS)

whatever the value of XO is.  In other words, when doing the
alias check for two symbols whose local definitions are in
the same block, we should apply the known difference between
their block offsets to the intersection test above.
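
The adjusted intersection test can be sketched as follows. This is an illustration under stated assumptions, not the code in alias.c: the function name `ranges_may_overlap` and the `distance` parameter (the known value of YB - XB for two symbols in the same block) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if the half-open ranges [xo, xo + xs) and
   [yo + distance, yo + distance + ys) intersect.  DISTANCE is the
   hypothetical known value of YB - XB, i.e. how far Y's base symbol
   lies beyond X's base symbol within the same block; 0 means both
   bases are the same address.  */
static bool
ranges_may_overlap (int64_t xo, int64_t xs, int64_t yo, int64_t ys,
		    int64_t distance)
{
  assert (xs > 0 && ys > 0);

  /* Express Y's start relative to X's base before comparing.  */
  int64_t y_start = yo + distance;
  return xo < y_start + ys && y_start < xo + xs;
}
```

Taking the anchor example above with XO = 16 and XS = YS = 8, comparing raw offsets (distance 0) reports no overlap between [16, 24) and [0, 8), whereas passing the real distance of 16 bytes shows that both accesses cover [16, 24) relative to the anchor.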

gcc/
PR rtl-optimization/92294
* alias.c (compare_base_symbol_refs): Take an extra parameter
and add the distance between two symbols to it.  Enshrine in
comments that -1 means "either 0 or 1, but we can't tell
which at compile time".
(memrefs_conflict_p): Update call accordingly.
(rtx_equal_for_memref_p): Likewise.  Take the distance between symbols
into account.
---
 gcc/alias.c | 47 +++
 1 file changed, 31 insertions(+), 16 deletions(-)

diff --git a/gcc/alias.c b/gcc/alias.c
index 8d3575e4e27..69e1eb89ac6 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -159,7 +159,8 @@ static tree decl_for_component_ref (tree);
 static int write_dependence_p (const_rtx,
   const_rtx, machine_mode, rtx,
   bool, bool, bool);
-static int compare_base_symbol_refs (const_rtx, const_rtx);
+static int compare_base_symbol_refs (const_rtx, const_rtx,
+HOST_WIDE_INT * = NULL);
 
 static void memory_modified_1 (rtx, const_rtx, void *);
 
@@ -1837,7 +1838,11 @@ rtx_equal_for_memref_p (const_rtx x, const_rtx y)
   return label_ref_label (x) == label_ref_label (y);
 
 case SYMBOL_REF:
-  return compare_base_symbol_refs (x, y) == 1;
+  {
+   HOST_WIDE_INT distance = 0;
+   return (compare_base_symbol_refs (x, y, ) == 1
+   && distance == 0);
+  }
 
 case ENTRY_VALUE:
   /* This is magic, don't go through canonicalization et al.  */
@@ -2172,10 +2177,20 @@ compare_base_decls (tree base1, tree base2)
   return ret;
 }
 
-/* Same as compare_base_decls but for SYMBOL_REF.  */
+/* Compare SYMBOL_REFs X_BASE and Y_BASE.
+
+   - Return 1 if Y_BASE 

[arm,testsuite]: Fix options for vceqz_p64.c and vceqzq_p64.c

2021-01-18 Thread Christophe Lyon via Gcc-patches
These two tests need:
dg-require-effective-target arm_crypto_ok
dg-add-options arm_crypto
because they use intrinsics that need -mfpu=crypto-neon-fp-armv8.

Committed as obvious.

2021-01-18  Christophe Lyon  

gcc/testsuite/
PR target/71233
* gcc.target/arm/simd/vceqz_p64.c: Use arm_crypto options.
* gcc.target/arm/simd/vceqzq_p64.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/arm/simd/vceqz_p64.c
b/gcc/testsuite/gcc.target/arm/simd/vceqz_p64.c
index f26cbff..c6aa6c9 100644
--- a/gcc/testsuite/gcc.target/arm/simd/vceqz_p64.c
+++ b/gcc/testsuite/gcc.target/arm/simd/vceqz_p64.c
@@ -2,7 +2,8 @@

 /* { dg-do compile } */
 /* { dg-options "-save-temps -O2 -fno-inline" } */
-/* { dg-add-options arm_neon } */
+/* { dg-require-effective-target arm_crypto_ok } */
+/* { dg-add-options arm_crypto } */

 #include "arm_neon.h"

diff --git a/gcc/testsuite/gcc.target/arm/simd/vceqzq_p64.c
b/gcc/testsuite/gcc.target/arm/simd/vceqzq_p64.c
index 355efd8..640754c 100644
--- a/gcc/testsuite/gcc.target/arm/simd/vceqzq_p64.c
+++ b/gcc/testsuite/gcc.target/arm/simd/vceqzq_p64.c
@@ -2,7 +2,8 @@

 /* { dg-do compile } */
 /* { dg-options "-save-temps -O2 -fno-inline" } */
-/* { dg-add-options arm_neon } */
+/* { dg-require-effective-target arm_crypto_ok } */
+/* { dg-add-options arm_crypto } */

 #include "arm_neon.h"


Re: [PATCH] libgomp: enable linux-futex on riscv64

2021-01-18 Thread Jakub Jelinek via Gcc-patches
On Mon, Jan 18, 2021 at 03:04:11PM +0100, Andreas Schwab wrote:
> Regtested on riscv64-suse-linux.
> 
> libgomp/
>   * configure.tgt (riscv64*-*-linux*): Add linux to config_path.

Ok, thanks.

> --- a/libgomp/configure.tgt
> +++ b/libgomp/configure.tgt
> @@ -64,6 +64,10 @@ if test x$enable_linux_futex = xyes; then
>   config_path="linux/powerpc linux posix"
>   ;;
>  
> +riscv64*-*-linux*)
> + config_path="linux posix"
> + ;;
> +
>  s390*-*-linux*)
>   config_path="linux/s390 linux posix"
>   ;;
> -- 
> 2.30.0
> 
> 
> -- 
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."

Jakub



[PATCH] libgomp: enable linux-futex on riscv64

2021-01-18 Thread Andreas Schwab
Regtested on riscv64-suse-linux.

libgomp/
* configure.tgt (riscv64*-*-linux*): Add linux to config_path.
---
 libgomp/configure.tgt | 4 
 1 file changed, 4 insertions(+)

diff --git a/libgomp/configure.tgt b/libgomp/configure.tgt
index be06be0e52b..fe2bf1dac51 100644
--- a/libgomp/configure.tgt
+++ b/libgomp/configure.tgt
@@ -64,6 +64,10 @@ if test x$enable_linux_futex = xyes; then
config_path="linux/powerpc linux posix"
;;
 
+riscv64*-*-linux*)
+   config_path="linux posix"
+   ;;
+
 s390*-*-linux*)
config_path="linux/s390 linux posix"
;;
-- 
2.30.0


-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] Modula-2 into the GCC tree on master

2021-01-18 Thread Gaius Mulley via Gcc-patches
Richard Biener  writes:

Hello Richard,

many thanks for taking the time to review the patches and tarball.

> It looks like libgm2 is built independently on whether m2 is enabled or not?
> I'd like to see a white-listing of supported targets like done for example
> for libgomp via configure.tgt or for libgo (see toplevel configure).

ok thanks for the pointers - will do.

> The driver changes have been posted and reviewed previously but I
> didn't see any real OK there but motivational questions - they never
> were posted together with the m2 driver portion (I guess that would
> be gcc/m2/gm2spec.c in the tarball).

yes true.

> I've not seen reviews or postings (besides as tarball) of the frontend
> or the library (but I don't remember seeing extensive reviews of
> other languages frontends or runtime portions at the point of their
> inclusion - still the glueing to the middle-end should get the chance
> to be reviewed).

the glue code for gm2 is in the directory gcc/m2/gm2-gcc which is
implemented in C and associated matching C header and M2 definition
modules.

> I've tried to find my way through gcc/m2 but am quite lost in the
> number of subdirectories.  I do see in gm2-lang.c and elsewhere
> inclusion of system headers outside of system.h which is going
> to be a portability problem.

ah thanks for spotting this - will fix.

> From the parse_file langhook we eventually dispatch to
> init_PerCompilationInit which looks like a Modula-2 scaffolding file?
> Is the compiler written in Modula-2?

yes mainly written in Modula-2, the sources are in gcc/m2/gm2-compiler,
core libraries are in gcc/m2/gm2-libs.  These are converted into C++
files during the build using the translator in gcc/m2/mc (gcc/m2/mc-boot
C++ version).  (For developers the Modula-2 compiler sources can be
built using stage1 gm2 later on).  All libraries are eventually compiled
by gm2 for target of course.

> It's not clear what parts make up the interface to the GCC middle-end.

the interface to the GCC middle-end is in gcc/m2/gm2-gcc which are
called by the front end sources in gcc/m2/gm2-compiler.  For example the
main declarations are performed by gcc/m2/gm2-compiler/M2GCCDeclare.mod
and the code trees are produced by gcc/m2/gm2-compiler/M2GenGCC.mod.

> I'm missing a patch for gcc/doc/install.texi which should list
> requirements plus a patch to sourcebuild.texi listing the new
> toplevel dirs (at least).

ah thank you yes I missed this.

> We don't usually ship "examples" in the GCC source tree,
> there's a gm2-tools directory which name suggests those are
> host tools which should usually reside in the toplevel.

ok sure, maybe best to move the examples into the regression test suite
and move the minimal number of gm2-tools required into the toplevel.

> There's copies of gpl and gpl-3.0.texi files in m2/ but I think
> all .texi stuff (even language specific) should be in gcc/doc/
> and not the lang specific subdirectory.

yes indeed sounds good and clean.

> I've just tried following the merge instructions and a build
> on SUSE Leap 15.2 produces a toplevel m2/ and stage{1,2,3,4}
> directories (empty?!) which hints at some bootstrapping magic taking place?
> In the end the build fails like the following in stage2
>
> bash: ..//home/rguenther/src/trunk/gcc/m2/tools-src/makeversion: No
> such file or directory
> make[3]: *** [/home/rguenther/src/trunk/gcc/m2/Make-lang.in:111:
> gm2version-check] Error 127
> make[3]: *** Waiting for unfinished jobs
> /bin/sh: ..//home/rguenther/src/trunk/gcc/m2/configure: No such file
> or directory
> make[3]: *** [/home/rguenther/src/trunk/gcc/m2/Make-lang.in:1159:
> m2/gm2config.h] Error 127
>
> (sorry, parallel make), re-doing serial make on top of the above yields
>
> bash: ..//home/rguenther/src/trunk/gcc/m2/tools-src/makeversion: No
> such file or directory
> make[3]: *** [/home/rguenther/src/trunk/gcc/m2/Make-lang.in:111:
> gm2version-check] Error 127
>
> looks like
>
> gm2version-check:
> cd m2 ; bash ../$(srcdir)/m2/tools-src/makeversion -p ../$(srcdir)
> $(STAMP) gm2version-check
>
> is bogus (in particular using $(srcdir) as part of a relative path?)

ah very sorry - yes - I'll fix this.

> I've just done ./configure --enable-languages=m2; make -j24
>
> I would suggest to not rush this in now during stage4
> but instead take the opportunity of this "quiet" phase
> to prepare an integration branch with all the issues above
> sorted out which we can merge at the beginning of stage1
> for GCC 12 (or later during stage4 if everyone is happy
> and/or backport for GCC 11.2 when it landed in trunk).

ok sure - this sounds like a good plan.

regards,
Gaius


[PATCH] testsuite/97299 - fix test condition of gcc.dg/vect/slp-reduc-3.c

2021-01-18 Thread Richard Biener
This avoids looking for permute optimization when SLP cannot be applied.

Pushed.

2021-01-18  Richard Biener  

PR testsuite/97299
* gcc.dg/vect/slp-reduc-3.c: Guard VEC_PERM_EXPR scan.
---
 gcc/testsuite/gcc.dg/vect/slp-reduc-3.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/slp-reduc-3.c 
b/gcc/testsuite/gcc.dg/vect/slp-reduc-3.c
index 4969fe82b25..fc875865208 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-reduc-3.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-3.c
@@ -60,4 +60,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 
"vect" { xfail *-*-* } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target { 
vect_short_mult && { vect_widen_sum_hi_to_si  && vect_unpack } } } } } */ 
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
xfail { vect_widen_sum_hi_to_si_pattern || { ! vect_unpack } } } } } */
-/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" } } */
+/* Check we can elide permutes if SLP vectorizing the reduction.  */
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" { xfail { 
vect_widen_sum_hi_to_si_pattern || { ! vect_unpack } } } } } */
-- 
2.26.2


Re: [PATCH] alias: Fix offset checks involving section anchors [PR92294]

2021-01-18 Thread Jan Hubicka
> >> 
> >> Well, in tree-ssa code we do assume these to be either disjoint objects
> >> or equal (in decl_refs_may_alias_p that continues in case
> >> compare_base_decls is -1).  I am not sure if we win much by treating
> >> them differently on RTL level. I would prefer staying consistent here.
> 
> Yeah, I see your point.  My concern here was that the fallback case
> applies to SYMBOL_REFs without decls, which might not have been visible
> at the tree-ssa level.  E.g. they might be ABI-defined symbols that have
> no known relation to source-level constructs.
> 
> E.g. the small-data base symbol _gp on MIPS points at a fixed offset
> from the start of the small-data area (0x7ff0 IIRC).  If the target
> generated rtl code that used _gp directly, we could wrongly assume
> that _gp+X can't alias BASE+Y when X != Y, even though the real test
> for small-data BASEs would be whether X + 0x7ff0 != Y.
> 
> I don't think that could occur in tree-ssa.  No valid C code would
> be able to refer directly to _gp in this way.
> 
> On the other hand, I don't have a specific example of where this does
> go wrong, it's just a feeling that it might.  I can drop it if you
> think that's better.

I would lean towards not disabling optimization when we have no good
reason for that - we have already done it a bit too many times in the
aliasing code, and it is hard to figure out which optimizations are missed
on purpose and which are missed just by omission.

We already committed to a very conservative assumption that every
external symbol can be an alias of another. I think we should have
originally required units that refer to the same memory location via
different symbols to declare it explicitly (i.e. make an external alias to
an external symbol), but we do not even allow external aliases (symtab
supports that though), and which symbols are aliased may also depend on
how the module is used.

We also decided to disable TBAA for direct accesses to decls to allow
type punning using unions.

This leaves the offset+range check as the only means of disambiguation.
While global arrays are not common in modern programs, they are in Fortran
code, so I would prefer not to cripple them even more.
(I am not sure how often the arrays are external though)

Perhaps we could simply test if symbol has associated decl and be
conservative for symbols w/o decl.

It is however just my preference.  I can live with being conservative
everywhere.

Honza
> 
> > So that's because if an alias (via alias attribute) is not visible
> > then it can be assumed to not exist.  Which means the bug is that
> > with section anchors we do not know which variables can be referred
> > to via the specific anchor?
> 
> No, we do know that.
> 
> > (if there's something like a "specific"
> > anchor)  That looks like the actual defect to me?  I see we have
> > block_symbol and object_block which may have all the data needed
> > in case accesses are somehow well-constrained?
> 
> Right.  And the patch does take advantage of that information.
> E.g. the existing:
> 
>   if (SYMBOL_REF_BLOCK (x_base) != SYMBOL_REF_BLOCK (y_base))
>   return 0;
> 
> says that symbols can't alias if they're known to belong to different
> blocks.  And if symbols are in the same block, the behaviour of the
> patch is to adjust the relative offsets to account for the positions
> of the symbols in the block.
> 
> The problem is just that “unequal offsets imply no alias” doesn't
> hold for section anchors.  The offset of the symbol from an anchor
> has to be taken into account.  If we have:
> 
>   (symbol_ref X)
>   (plus (symbol_ref ANCHOR) Y) == unpreempted X
> 
> then ignoring Y gives two false results, a false positive and a false
> negative:
> 
> (a) the existing code might wrongly assume that an access to X+0 could
> alias an access to ANCHOR+0, because the offsets are equal.
> 
> (b) the existing code would wrongly assume that an access to X+0 can't
> alias an access to ANCHOR+Y, because the offsets are unequal.
> 
> The relative offset has to be adjusted by Y first, before applying
> the “unequal offsets imply no alias” rule.
> 
> >> Otherwise the patch looks good to me.
> >
> > So let's go with it?  It looks like for decl vs. section anchor we
> > can identify the offset of the decl in the anchor block and thus
> > determine an offset adjustment necessary to perform an offset based
> > check, no?
> 
> Yeah, that's what the patch is trying to do.
> 
> Thanks,
> Richard
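
The offset adjustment discussed in the quoted text can be sketched in isolation.  This is only an illustration under stated assumptions, not the code from the patch: struct sym, its fields and block_symbols_alias are hypothetical names, neither symbol is assumed to be preemptible, and symbols without a decl are simply treated as "don't know".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a SYMBOL_REF.  The field names are
   illustrative only and do not correspond to GCC data structures.  */
struct sym
{
  bool has_decl;       /* false for a "bare" symbol with no decl */
  int block_id;        /* which object block the symbol belongs to */
  long block_offset;   /* position of the symbol within that block */
};

enum alias_result { NO_ALIAS = 0, MAY_ALIAS = 1 };

/* Return true if [a, a + asize) and [b, b + bsize) overlap.  */
static bool
ranges_overlap_p (long a, long asize, long b, long bsize)
{
  return a < b + bsize && b < a + asize;
}

/* Compare an access of XSIZE bytes at X + XOFF with an access of
   YSIZE bytes at Y + YOFF.  */
enum alias_result
block_symbols_alias (const struct sym *x, long xoff, long xsize,
                     const struct sym *y, long yoff, long ysize)
{
  assert (xsize > 0 && ysize > 0);

  /* Be conservative when nothing is known about a symbol.  */
  if (!x->has_decl || !y->has_decl)
    return MAY_ALIAS;

  /* Symbols known to be in different blocks cannot alias.  */
  if (x->block_id != y->block_id)
    return NO_ALIAS;

  /* Same block: translate both accesses into block-relative positions
     before applying the "unequal offsets imply no alias" rule.  */
  long xpos = x->block_offset + xoff;
  long ypos = y->block_offset + yoff;
  return ranges_overlap_p (xpos, xsize, ypos, ysize) ? MAY_ALIAS : NO_ALIAS;
}
```

With X placed 16 bytes into the anchor's block, X+0 against ANCHOR+0 is reported as independent (case (a) above) and X+0 against ANCHOR+16 as a possible alias (case (b) above).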


Re: [PATCH] analyzer: use "malloc" attribute

2021-01-18 Thread Richard Biener
On Sun, 17 Jan 2021, David Malcolm wrote:

> This is an updated version of this patch from October:
> 
>   'RFC: add "deallocated_by" attribute for use by analyzer'
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/44.html
> 
> reworking it to build on top of Martin's work as noted below, reusing
> the existing attribute rather than adding a new one.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> Also tested by hand with valgrind.
> 
> Apart from a trivial change to attrib.h and to builtins.c, this
> is confined to the analyzer code and docs.  The original patch was
> posted in stage 1.  Is this OK for master?  (I'm hoping for
> release manager permission to commit this code now; it's not
> clear to me whether the end of stage 3 was on the 16th or is
> today on the 17th).

I guess it's still OK if you're quick.

Richard.

> Thanks
> Dave
> 
> In dce6c58db87ebf7f4477bd3126228e73e497 msebor extended the
> "malloc" attribute to support user-defined allocator/deallocator
> pairs.
> 
> This patch extends the "malloc" checker within -fanalyzer to use
> these attributes.  It is based on an earlier patch:
>   'RFC: add "deallocated_by" attribute for use by analyzer'
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/44.html
> which added a different attribute.  I mistakenly thought that it would
> be easy to merge our patches; the patch turned out to need a lot of
> reworking, to support multiple deallocators per allocator.
> 
> My hope was that this would provide a minimal level of markup that would
> support library-checking without requiring lots of further markup.
> I attempted to use this to detect a memory leak within a Linux
> driver (CVE-2019-19078), by adding the attribute to mark these fns:
> extern struct urb *usb_alloc_urb(int iso_packets, gfp_t mem_flags);
> extern void usb_free_urb(struct urb *urb);
> where there is a leak of a "urb" on an error-handling path.
> Unfortunately I ran into the problem that there are various other fns
> that take "struct urb *" and the analyzer conservatively assumes that a
> urb passed to them might or might not be freed and thus stops tracking
> state for them.
> 
> Hence this will only detect issues for the simplest cases (without
> adding another attribute).
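
For readers unfamiliar with the extended attribute, a minimal sketch of an allocator/deallocator pair marked up this way follows.  The widget type and functions are hypothetical and are not taken from the patch or from the kernel; the attribute form with a deallocator argument is assumed to require GCC 11 or later, so the macro expands to nothing elsewhere.

```c
#include <assert.h>
#include <stdlib.h>

/* "malloc (deallocator, ptr-index)" is only understood by GCC 11 and
   later; other compilers see an empty macro.  */
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 11
# define ATTR_MALLOC_DEALLOC(fn) __attribute__ ((malloc (fn, 1)))
#else
# define ATTR_MALLOC_DEALLOC(fn)
#endif

struct widget { int id; };   /* hypothetical resource type */

void widget_free (struct widget *w);

/* Pointers returned by widget_alloc should be released with
   widget_free, passed as its first argument.  */
ATTR_MALLOC_DEALLOC (widget_free)
struct widget *widget_alloc (int id);

struct widget *
widget_alloc (int id)
{
  struct widget *w = malloc (sizeof *w);
  if (w)
    w->id = id;
  return w;
}

void
widget_free (struct widget *w)
{
  free (w);
}

/* Allocate, use and release a widget.  Dropping the widget_free call
   would be the kind of leak the checker is meant to report.  */
int
use_widget (int id)
{
  struct widget *w = widget_alloc (id);
  if (!w)
    return -1;
  assert (w->id == id);
  int result = w->id;
  widget_free (w);
  return result;
}
```

Compiling such a file with -fanalyzer is how the markup would be exercised; whether a given leak is reported depends on the limitation described above.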
> 
> gcc/analyzer/ChangeLog:
>   * analyzer.h (is_std_named_call_p): New decl.
>   * diagnostic-manager.cc (path_builder::get_sm): New.
>   (state_change_event_creator::state_change_event_creator): Add "pb"
>   param.
>   (state_change_event_creator::on_global_state_change): Don't consider
>   state changes affecting other state_machines.
>   (state_change_event_creator::on_state_change): Likewise.
>   (state_change_event_creator::m_pb): New field.
>   (diagnostic_manager::add_events_for_eedge): Pass pb to visitor
>   ctor.
>   * region-model-impl-calls.cc
>   (region_model::impl_deallocation_call): New.
>   * region-model.cc: Include "attribs.h".
>   (region_model::on_call_post): Handle fndecls referenced by
>   __attribute__((deallocated_by(FOO))).
>   * region-model.h (region_model::impl_deallocation_call): New decl.
>   * sm-malloc.cc: Include "stringpool.h" and "attribs.h".  Add
>   leading comment.
>   (class api): Delete.
>   (enum resource_state): Update comment for change from api to
>   deallocator and deallocator_set.
>   (allocation_state::allocation_state): Drop api param.  Add
>   "deallocators" and "deallocator".
>   (allocation_state::m_api): Drop field in favor of...
>   (allocation_state::m_deallocators): New field.
>   (allocation_state::m_deallocator): New field.
>   (enum wording): Add WORDING_DEALLOCATED.
>   (struct deallocator): New.
>   (struct standard_deallocator): New.
>   (struct custom_deallocator): New.
>   (struct deallocator_set): New.
>   (struct custom_deallocator_set): New.
>   (struct standard_deallocator_set): New.
>   (struct deallocator_set_map_traits): New.
>   (malloc_state_machine::m_malloc): Drop field
>   (malloc_state_machine::m_scalar_new): Likewise.
>   (malloc_state_machine::m_vector_new): Likewise.
>   (malloc_state_machine::m_free): New field
>   (malloc_state_machine::m_scalar_delete): Likewise.
>   (malloc_state_machine::m_vector_delete): Likewise.
>   (malloc_state_machine::deallocator_map_t): New typedef.
>   (malloc_state_machine::m_deallocator_map): New field.
>   (malloc_state_machine::deallocator_set_cache_t): New typedef.
>   (malloc_state_machine::m_custom_deallocator_set_cache): New field.
>   (malloc_state_machine::custom_deallocator_set_map_t): New typedef.
>   (malloc_state_machine::m_custom_deallocator_set_map): New field.
>   (malloc_state_machine::m_dynamic_sets): New field.
>   (malloc_state_machine::m_dynamic_deallocators): New field.
>   (api::api): Delete.
>   (deallocator::deallocator): 

Re: [PATCH] keep scope blocks for all inlined functions (PR 98664)

2021-01-18 Thread Richard Biener via Gcc-patches
On Sun, Jan 17, 2021 at 1:46 AM Martin Sebor  wrote:
>
> On 1/15/21 12:44 AM, Richard Biener wrote:
> > On Thu, Jan 14, 2021 at 8:13 PM Martin Sebor via Gcc-patches
> >  wrote:
> >>
> >> One aspect of PR 98465 - Bogus warning stringop-overread for std::string
> >> is the inconsistency between -g and -g0 which turns out to be due to
> >> GCC eliminating apparently unused scope blocks from inlined functions
> >> that aren't explicitly declared inline and artificial.  PR 98664 tracks
> >> just this part of PR 98465.
> >>
> >> To resolve just the PR 98664 subset the attached change has
> >> the tree-ssa-live.c pass preserve these blocks for all inlined
> >> functions, not just artificial ones.  Besides avoiding the interaction
> >> between -g and warnings it also seems to improve the inlining context
> >> by including more inlined call sites.  This can be seen in the adjusted
> >> tests.  (Its effect on PR 98465 is that the false positive is issued
> >> consistently, regardless of -g.  Avoiding the false positive is my
> >> next step.)
> >>
> >> Jakub, you raised a concern yesterday in PR 98465 c#13 about the memory
> >> footprint of this change.  Can you please comment on whether it's in
> >> line with what you were suggesting?
> >
> >   {
> > tree ao = BLOCK_ABSTRACT_ORIGIN (block);
> > -  if (TREE_CODE (ao) == FUNCTION_DECL)
> > -   loc = BLOCK_SOURCE_LOCATION (block);
> > -  else if (TREE_CODE (ao) != BLOCK)
> > -   break;
> > +   if (TREE_CODE (ao) == FUNCTION_DECL)
> > +loc = BLOCK_SOURCE_LOCATION (block);
> > +   else if (TREE_CODE (ao) != BLOCK)
> > +break;
> >
> > you are replacing tabs with spaces?
> >
> > @@ -558,16 +558,13 @@ remove_unused_scope_block_p (tree scope, bool
> > in_ctor_dtor_block)
> >  else if (!flag_auto_profile && debug_info_level == DINFO_LEVEL_NONE
> >  && !optinfo_wants_inlining_info_p ())
> >{
> > -   /* Even for -g0 don't prune outer scopes from artificial
> > - functions, otherwise diagnostics using tree_nonartificial_location
> > - will not be emitted properly.  */
> > +   /* Even for -g0 don't prune outer scopes from inlined functions,
> > + otherwise late diagnostics from such functions will not be
> > + emitted or suppressed properly.  */
> >  if (inlined_function_outer_scope_p (scope))
> >   {
> > tree ao = BLOCK_ORIGIN (scope);
> > -  if (ao
> > -  && TREE_CODE (ao) == FUNCTION_DECL
> > -  && DECL_DECLARED_INLINE_P (ao)
> > -  && lookup_attribute ("artificial", DECL_ATTRIBUTES (ao)))
> > +  if (ao && TREE_CODE (ao) == FUNCTION_DECL)
> >   unused = false;
> >   }
> >}
> >
> > so which inlined_function_outer_scope_p are you _not_ marking now?
> > BLOCK_ORIGIN is never NULL and all inlined scopes should have
> > an abstract origin - I believe always a FUNCTION_DECL.  Which means
> > you could have simplified it further?
>
> Quite possibly.  I could find no documentation for these macros so
> I tried to keep my changes conservative.  I did put together some
> notes to document what I saw the macros evaluate to in my testing
> (below).  If/when it's close to accurate I'd like to add them to
> tree.h and to the internals manual.
>
> > And yes, the main reason for the code above is memory use for
> > C++ with lots of inlining.  I suggest to try the patch on tramp3d
> > for example (there's about 10 inline instances per emitted
> > assembly op).
>
> This one:
> https://github.com/llvm-mirror/test-suite/tree/master/MultiSource/Benchmarks/tramp3d-v4
> ?

yeah

> With the patch, 69,022 more blocks with distinct numbers are kept
> than without it.  I see some small differences in -fmem-report
> and -ftime-report output:
>
>Total: 286 -> 288M  210 -> 211M  3993 -> 4019k
>
> I'm not really sure what to look at so I attach the two reports
> for you to judge for yourself.

A build with --enable-gather-detailed-mem-stats would have given
statistics on BLOCK trees I think; otherwise -fmem-report is
not too useful, but I guess the overall stats above tell us the
overhead is manageable.

> I also attach an updated patch with the slight simplification you
> suggested.

So I was even suggesting to do

  if (inlined_function_outer_scope_p (scope))
unused = false;

and maybe gcc_assert (TREE_CODE (orig) == FUNCTION_DECL)
but I think the patch is OK as updated.

> Martin
>
> PS Here are my notes on the macros and the two related functions:
>
> BLOCK: Denotes a lexical scope.  Contains BLOCK_VARS of variables
> declared in it, BLOCK_SUBBLOCKS of scopes nested in it, and
> BLOCK_CHAIN pointing to the next BLOCK.  Its BLOCK_SUPERCONTEXT
> points to the BLOCK of the enclosing scope.  May have
> a BLOCK_ABSTRACT_ORIGIN and a BLOCK_SOURCE_LOCATION.
>
> BLOCK_SUPERCONTEXT: The scope of the enclosing block, or FUNCTION_DECL
> for the "outermost" function scope.  Inlined 

GCC 11.0.0 Status Report (2021-01-18), Stage 4 in effect now

2021-01-18 Thread Richard Biener


Status
======

GCC trunk which eventually will become GCC 11 is now in
regression and documentation fixes only mode (Stage 4).

Please help triaging and fixing regressions to make a timely
release of GCC 11 possible.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               62   -  5
P2              334   +  3
P3               35   +  1
P4              190
P5               24
--------        ---   -----------------------
Total P1-P3     432   -  1
Total           646   -  1


Previous Report
===============

https://gcc.gnu.org/pipermail/gcc/2021-January/234686.html


Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init

2021-01-18 Thread Richard Sandiford via Gcc-patches
Qing Zhao  writes:
>>>> D will keep all initialized aggregates as aggregates and live which
>>>> means stack will be allocated for it.  With A the usual optimizations
>>>> to reduce stack usage can be applied.
>>> 
>>> I checked the routine “pov::bump_map” in 511.povray_r, since it has
>>> a large stack increase due to implementation D, by examining the IR
>>> immediately before the RTL expansion phase
>>> (image.cpp.244t.optimized).  I found that we have the following
>>> additional statements for the array elements:
>>> 
>>> void  pov::bump_map (double * EPoint, struct TNORMAL * Tnormal, double
>>> * normal)
>>> {
>>> …
>>> double p3[3];
>>> double p2[3];
>>> double p1[3];
>>> float colour3[5];
>>> float colour2[5];
>>> float colour1[5];
>>> …
>>>  # DEBUG BEGIN_STMT
>>> colour1 = .DEFERRED_INIT (colour1, 2);
>>> colour2 = .DEFERRED_INIT (colour2, 2);
>>> colour3 = .DEFERRED_INIT (colour3, 2);
>>> # DEBUG BEGIN_STMT
>>> MEM  [(double[3] *)] = p1$0_144(D);
>>> MEM  [(double[3] *) + 8B] = p1$1_135(D);
>>> MEM  [(double[3] *) + 16B] = p1$2_138(D);
>>> p1 = .DEFERRED_INIT (p1, 2);
>>> # DEBUG D#12 => MEM  [(double[3] *)]
>>> # DEBUG p1$0 => D#12
>>> # DEBUG D#11 => MEM  [(double[3] *) + 8B]
>>> # DEBUG p1$1 => D#11
>>> # DEBUG D#10 => MEM  [(double[3] *) + 16B]
>>> # DEBUG p1$2 => D#10
>>> MEM  [(double[3] *)] = p2$0_109(D);
>>> MEM  [(double[3] *) + 8B] = p2$1_111(D);
>>> MEM  [(double[3] *) + 16B] = p2$2_254(D);
>>> p2 = .DEFERRED_INIT (p2, 2);
>>> # DEBUG D#9 => MEM  [(double[3] *)]
>>> # DEBUG p2$0 => D#9
>>> # DEBUG D#8 => MEM  [(double[3] *) + 8B]
>>> # DEBUG p2$1 => D#8
>>> # DEBUG D#7 => MEM  [(double[3] *) + 16B]
>>> # DEBUG p2$2 => D#7
>>> MEM  [(double[3] *)] = p3$0_256(D);
>>> MEM  [(double[3] *) + 8B] = p3$1_258(D);
>>> MEM  [(double[3] *) + 16B] = p3$2_260(D);
>>> p3 = .DEFERRED_INIT (p3, 2);
>>> ….
>>> }
>>> 
>>> I guess that the above “MEM ….. = …” are the ones that make the
>>> differences. Which phase introduced them?
>> 
>> Looks like SRA. But you can just dump all and grep for the first occurrence. 
>
> Yes, it looks like SRA is the one:
>
> image.cpp.035t.esra:  MEM  [(double[3] *)] = p1$0_195(D);
> image.cpp.035t.esra:  MEM  [(double[3] *) + 8B] = p1$1_182(D);
> image.cpp.035t.esra:  MEM  [(double[3] *) + 16B] = p1$2_185(D);

I realise no-one was suggesting otherwise, but FWIW: SRA could easily
be extended to handle .DEFERRED_INIT if that's the main source of
excess stack usage.  A single .DEFERRED_INIT of an aggregate can
be split into .DEFERRED_INITs of individual components.

In other words, the investigation you're doing looks like the right way
of deciding which passes are worth extending to handle .DEFERRED_INIT.

Thanks,
Richard


[committed] libstdc++: Fix narrow char test to use stringbuf not wstringbuf

2021-01-18 Thread Jonathan Wakely via Gcc-patches
This seems to be a copy & paste error.

libstdc++-v3/ChangeLog:

* testsuite/27_io/basic_stringstream/cons/char/1.cc: Use
stringbuf not wstringbuf.

Tested x86_64-linux. Committed to trunk.

commit a81d2f1e414836549b909f2de927b6ae10e8b156
Author: Jonathan Wakely 
Date:   Mon Jan 18 12:44:27 2021

libstdc++: Fix narrow char test to use stringbuf not wstringbuf

This seems to be a copy & paste error.

libstdc++-v3/ChangeLog:

* testsuite/27_io/basic_stringstream/cons/char/1.cc: Use
stringbuf not wstringbuf.

diff --git a/libstdc++-v3/testsuite/27_io/basic_stringstream/cons/char/1.cc 
b/libstdc++-v3/testsuite/27_io/basic_stringstream/cons/char/1.cc
index 31130ee5c95..7cb9f34ca04 100644
--- a/libstdc++-v3/testsuite/27_io/basic_stringstream/cons/char/1.cc
+++ b/libstdc++-v3/testsuite/27_io/basic_stringstream/cons/char/1.cc
@@ -107,7 +107,7 @@ test04()
   sstream ss3(std::string(str), std::ios::out, a);
   VERIFY( ss3.str() == cstr );
   VERIFY( bool(ss3 << 1) );
-  VERIFY( ss3.get() == std::wstringbuf::traits_type::eof() );
+  VERIFY( ss3.get() == std::stringbuf::traits_type::eof() );
 }
 
 int


Re: [backport gcc10, gcc9] Request to backport PR97969

2021-01-18 Thread Richard Biener
On Mon, 18 Jan 2021, Przemyslaw Wirkus wrote:

> Hi all,
> 
> Can we backport PR97969 patch to GCC 10 and (maybe) GCC 9 ?:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97969
> 
> IMHO the bug is severe and the fix could land in GCC 10 and 9. Vladimir's original patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563322.html
> applies without changes to both gcc-10 and gcc-9.
> 
> I've regression tested this patch on both the gcc-10 and gcc-9 branches with an
> x86_64 cross compiler (arm-eabi target) and found no issues.
> 
> OK for gcc-10 and gcc-9 ?

I see two fallout PRs with a trivial search: PR98643 and PR98722.  LRA
patches quite easily trigger unexpected fallout unfortunately ...

Richard.

> PS: I can commit if approved.
> 
> Kind regards,
> Przemyslaw Wirkus
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] Modula-2 into the GCC tree on master

2021-01-18 Thread Matthias Klose
On 1/18/21 2:09 AM, Gaius Mulley via Gcc-patches wrote:
> gcc/
> 
> * gcc/brig/brigspec.c (lang_register_spec_functions): Added.
> * gcc/c-family/cppspec.c (lang_register_spec_functions): Added.
> * gcc/c/gccspec.c (lang_register_spec_functions): Added.
> * gcc/cp/g++spec.c (lang_register_spec_functions): Added.
> * gcc/d/d-spec.cc (lang_register_spec_functions): Added.
> * gcc/fortran/gfortranspec.c(lang_register_spec_functions): Added.
> * gcc/gcc.c (allow_linker): Global variable to disable
> linker by the front end.  (xputenv) available externally.
> (xgetenv) New function.  (save_switch) available externally.
> (fe_add_linker_option) New function.  (handle_OPT_B) New function.
> (fe_add_infile) New function.  (fe_mark_compiled) New function.
> (driver_handle_option) call handle_OPT_B.  (print_option) New
> function.  (print_options) New function.  (dbg_options) New function.
> (fe_add_spec_function) New function.  (lookup_spec_function)
> checks front end registered functions.
> (driver::set_up_specs):  call lang_register_spec_functions.
> (maybe_run_linker): Check allow_linker before running the linker.
> * gcc/gcc.h (fe_save_switch): Prototype.
> (handle_OPT_B) Prototype.  (fe_add_infile) Prototype.
> (fe_add_linker_option) Prototype.  (fe_add_spec_function) Prototype.
> (xputenv) Prototype.  (xgetenv) Prototype.  (print_options) Prototype.
> (print_option) Prototype.  (dbg_options) Prototype.
> (lang_register_spec_functions) Prototype.
> (allow_linker): Extern.
> * gcc/go/gospec.c (lang_register_spec_functions): Added.

this is missing the definition of lang_register_spec_functions for the jit build.

2020-03-23  Matthias Klose  

* jit-spec.c (lang_register_spec_functions): New, not used for jit.


--- a/gcc/jit/jit-spec.c
+++ b/gcc/jit/jit-spec.c
@@ -39,3 +39,9 @@ lang_specific_pre_link (void)

 /* Number of extra output files that lang_specific_pre_link may generate.  */
 int lang_specific_extra_outfiles = 0;  /* Not used for jit.  */
+
+/* lang_register_spec_functions.  Not used for jit.  */
+void
+lang_register_spec_functions (void)
+{
+}



Re: [PATCH] alias: Fix offset checks involving section anchors [PR92294]

2021-01-18 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Mon, 18 Jan 2021, Jan Hubicka wrote:
>
>> > This is a repost of:
>> > 
>> >   https://gcc.gnu.org/pipermail/gcc-patches/2020-February/539763.html
>> > 
>> > which was initially posted during stage 4.  (And yeah, I only just
>> > missed stage 4 again.)
>> > 
>> > IMO it would be better to fix the bug directly (as the patch tries
>> > to do) instead of wait for a more thorough redesign of this area.
>> > See the end of:
>> > 
>> >   https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540002.html
>> > 
>> > for some stats.
>> > 
>> > Honza: Richard said he'd like your opinion on the patch.
>> > 
>> > 
>> > memrefs_conflict_p has a slightly odd structure.  It first checks
>> > whether two addresses based on SYMBOL_REFs refer to the same object,
>> > with a tristate result:
>> > 
>> >   int cmp = compare_base_symbol_refs (x,y);
>> > 
>> > If the addresses do refer to the same object, we can use offset-based 
>> > checks:
>> > 
>> >   /* If both decls are the same, decide by offsets.  */
>> >   if (cmp == 1)
>> > return offset_overlap_p (c, xsize, ysize);
>> > 
>> > But then, apart from the special case of forced address alignment,
>> > we use an offset-based check even if we don't know whether the
>> > addresses refer to the same object:
>> > 
>> >   /* Assume a potential overlap for symbolic addresses that went
>> > through alignment adjustments (i.e., that have negative
>> > sizes), because we can't know how far they are from each
>> > other.  */
>> >   if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>> >return -1;
>> >   /* If decls are different or we know by offsets that there is no 
>> > overlap,
>> > we win.  */
>> >   if (!cmp || !offset_overlap_p (c, xsize, ysize))
>> >return 0;
>> > 
>> > This somewhat contradicts:
>> > 
>> >   /* In general we assume that memory locations pointed to by different 
>> > labels
>> >  may overlap in undefined ways.  */
>> 
>> I suppose it is because the code above checks for SYMBOL_REF and not
>> label (that is probably about jumptables and constpool injected into
>> text segment).
>> 
>> I assume this is also partly a result of GCC not being very systematic about
>> aliases.  Sometimes it assumes that two different symbols do not point
>> to the same object while in other cases it is worried about aliases.
>> 
>> I see that anchors are special since they point to the "same object" with
>> different offsets.
>> > 
>> > at the end of compare_base_symbol_refs.  In other words, we're taking -1
>> > to mean that either (a) the symbols are equal (via aliasing) or (b) the
>> > references access non-overlapping objects.
>> 
>> For symbol refs yes, I think so.
>> > 
>> > But even assuming that's true for normal symbols, it doesn't cope
>> > correctly with section anchors.  If a symbol X at ANCHOR+OFFSET is
>> > preemptible, either (a) X = ANCHOR+OFFSET (rather than the X = ANCHOR
>> > assumed above) or (b) X and ANCHOR reference non-overlapping objects.
>> > 
>> > And an offset-based comparison makes no sense for an anchor symbol
>> > vs. a bare symbol with no decl.  If the bare symbol is allowed to
>> > alias other symbols then it can surely alias any symbol in the
>> > anchor's block, so there are multiple anchor offsets that might
>> > induce an alias.
>> > 
>> > This patch therefore replaces the current tristate:
>> > 
>> >   - known equal
>> >   - known independent (two accesses can't alias)
>> >   - equal or independent
>> > 
>> > with:
>> > 
>> >   - known distance apart
>> >   - known independent (two accesses can't alias)
>> >   - known distance apart or independent
>> >   - don't know
>> > 
>> > For safety, the patch puts all bare symbols in the "don't know"
>> > category.  If that turns out to be too conservative, we at least
>> > need that behaviour for combinations involving a bare symbol
>> > and a section anchor.  However, bare symbols should be relatively
>> > rare these days.
>> 
>> Well, in tree-ssa code we do assume these to be either disjoint objects
>> or equal (in decl_refs_may_alias_p that continues in case
>> compare_base_decls is -1).  I am not sure if we win much by treating
>> them differently on RTL level. I would prefer staying consistent here.

Yeah, I see your point.  My concern here was that the fallback case
applies to SYMBOL_REFs without decls, which might not have been visible
at the tree-ssa level.  E.g. they might be ABI-defined symbols that have
no known relation to source-level constructs.

E.g. the small-data base symbol _gp on MIPS points at a fixed offset
from the start of the small-data area (0x7ff0 IIRC).  If the target
generated rtl code that used _gp directly, we could wrongly assume
that _gp+X can't alias BASE+Y when X != Y, even though the real test
for small-data BASEs would be whether X + 0x7ff0 != Y.

I don't think that could occur in tree-ssa.  No valid C code would
be able to refer directly to _gp in this way.

On the other hand, I don't 

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2021-01-18 Thread Rainer Orth
Hi Jakub,

> On Sun, Jan 17, 2021 at 04:25:24PM +0100, Andreas Schwab wrote:
>> On Jan 17 2021, Jakub Jelinek via Gcc-patches wrote:
>> 
>> > Kwok, I guess you can reproduce it even on Linux with --disable-linux-futex
>> 
>> And all targets that are not explicitly configured in
>> libgomp/configure.tgt, where --enable-linux-futex is a no-op.
>
> Completely untested patch (except for the linux futex version; and RTEMS
> stuff is missing; I think it doesn't have a function for it but has a
> counter in the struct, so perhaps fetch it manually from there), it is
> Sunday, don't want to do more tonight:

this worked for me on both i386-pc-solaris2.11 and
sparc-sun-solaris2.11, thanks.  However, I had to rerun the builds with
the DWARF-5 patch backed out since that caused so much breakage that the
results were all but useless.

Two comments, though:

> --- libgomp/config/linux/sem.h.jj 2021-01-04 10:25:56.160037625 +0100
> +++ libgomp/config/linux/sem.h2021-01-17 16:49:39.900750416 +0100
> @@ -85,4 +85,13 @@ gomp_sem_post (gomp_sem_t *sem)
>if (__builtin_expect (count & SEM_WAIT, 0))
>  gomp_sem_post_slow (sem);
>  }
> +
> +static inline int
> +gomp_sem_getcount (gomp_sem_t *sem)
> +{
> +  int count = __atomic_load_n (sem, MEMMODEL_RELAXED);
> +  if ((count & SEM_WAIT) != 0)
> +return -1;
> +  return count / SEM_INC;
> +}
>  #endif /* GOMP_SEM_H */
> --- libgomp/config/posix/sem.h.jj 2021-01-04 10:25:56.166037557 +0100
> +++ libgomp/config/posix/sem.h2021-01-17 16:49:53.605593659 +0100
> @@ -64,6 +64,8 @@ extern void gomp_sem_post (gomp_sem_t *s
>  
>  extern void gomp_sem_destroy (gomp_sem_t *sem);
>  
> +extern int gomp_sem_getcount (gomp_sem_t *sem);
> +
>  #else /* HAVE_BROKEN_POSIX_SEMAPHORES  */
>  
>  typedef sem_t gomp_sem_t;
> @@ -84,5 +86,13 @@ static inline void gomp_sem_destroy (gom
>  {
>sem_destroy (sem);
>  }
> +
> +static inline int gomp_sem_getcount (gomp_sem_t *sem)

Shouldn't there be a line break before gomp_sem_getcount (and once
again in posix/sem.c), as done in linux/sem.h above?  libgomp seems a
bit inconsistent in that matter, though.
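
For illustration, a stand-alone sketch of the Linux variant in that layout
(return type on its own line, function name in column one).
SKETCH_SEM_WAIT and SKETCH_SEM_INC are stand-ins picked for the example,
not the libgomp definitions, and a plain int argument replaces the atomic
load:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical encoding: the top bit flags waiters, the rest is the count.  */
#define SKETCH_SEM_WAIT INT_MIN
#define SKETCH_SEM_INC 1

/* Return the count stored in RAW, or -1 if the waiter flag is set.  */
int
sketch_sem_getcount (int raw)
{
  if ((raw & SKETCH_SEM_WAIT) != 0)
    return -1;
  return raw / SKETCH_SEM_INC;
}
```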

Besides, I've seen regular timeouts on both Solaris and Linux/x86_64 for
one of the new tests:

WARNING: libgomp.fortran/task-detach-6.f90   -O2  execution test program timed out.
FAIL: libgomp.fortran/task-detach-6.f90   -O2  execution test

It doesn't happen every time when manually running the test, but every
third or fourth time.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: BoF DWARF5 patches (25% .debug section size reduction)

2021-01-18 Thread Mark Wielaard
> On 15/01/2021 17:16, Jakub Jelinek via Gcc-patches wrote:
> > On Sun, Nov 15, 2020 at 11:41:24PM +0100, Mark Wielaard wrote:
> >> From: Mark Wielaard
> >> Date: Tue, 29 Sep 2020 15:52:44 +0200
> >> Subject: [PATCH] Default to DWARF5
> >>
> >> gcc/ChangeLog:
> >>
> >>* common.opt (gdwarf-): Init(5).
> >>* doc/invoke.texi (-gdwarf): Document default to 5.
> > Ok for trunk.
> 
> I noticed a build error with aarch64-rtems recently:
> [...]
> -fdata-sections -frandom-seed=cxx11-ios_failure.lo -g -O2 -g0 -c 
> cxx11-ios_failure-lt.s -o cxx11-ios_failure.o
> cxx11-ios_failure-lt.s: Assembler messages:
> cxx11-ios_failure-lt.s:38443: Error: file number less than one
> 
> This is the related code:
> 
> .Ldebug_line0:
>  .file 0 
> "/tmp/sh/b-gcc-git-aarch64-rtems6/aarch64-rtems6/libstdc++-
> v3/src/c++11" 
> "/home/EB/sebastian_h/src/gcc/libstdc++-v3/src/c++11/cxx11-
> ios_failure.cc"
>  .section.debug_str,"MS",@progbits,1
> 
> I am not sure if this is related to the change

It was, sorry. I thought I had tested with both old and new binutils,
but apparently I messed up and didn't properly test with new binutils.

This is tracked as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98708
and Jakub pushed a workaround:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563744.html

Cheers,

Mark


Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-18 Thread Richard Sandiford via Gcc-patches
Hongtao Liu  writes:
> On Mon, Jan 18, 2021 at 6:18 PM Richard Sandiford
>  wrote:
>>
>> Hongtao Liu via Gcc-patches  writes:
>> > Hi:
>> >   If SRC had been assigned a mode narrower than the copy, we can't link
>> > DEST into the chain even they have same
>> > hard_regno_nregs(i.e. HImode/SImode in i386 backend).
>>
>> In general, changes between modes within the same hard register are OK.
>> Could you explain in more detail what's going wrong?
>>
>
> cprop hardreg change
>
> (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> (reg:SI 37 r9 [orig:86 _11 ] [86])) "test.c":29:36 75 
> {*movsi_internal}
>  (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
> (nil)))
>
> to
>
> (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> (reg:SI 22 xmm2 [orig:86 _11 ] [86])) "test.c":29:36 75
> {*movsi_internal}
>  (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
> (nil)))
>
> since (reg:SI 22 xmm2) and (reg:SI r9) are in the same value chain in
> which the oldest regno is k0.
>
> but with xmm2 defined as
>
> kmovw %k0, %edi  # 69 [c=4 l=4] *movhi_internal/6- kmovw move the
> lower 16bits to %edi, and clear the upper 16 bits.
> vmovd %edi, %xmm2 # 489 *movsi_internal  --- vmovd move 32bits from
> %edi to %xmm2.
>
> (insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
> (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
> {*movhi_internal}
>  (nil))
>
> (insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
> (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
>  (nil))

The sequence is OK in itself, but insn 489 can't make any assumptions
about what's in the upper 16 bits of %edi.  In other words, as far as
RTL semantics are concerned, insn 489 only leaves bits 0-15 of %xmm2
with defined values; the other bits are undefined.

If the target wants all 32 bits of %edi to be carried over to insn 489
then it needs to make insn 69 an SImode set instead of a HImode set.

So what cprop is doing is OK: it's changing the values of undefined
bits but not changing the definition of defined bits (from an RTL
point of view).
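
A rough way to picture this is to track, next to each register value, which
bits RTL considers defined.  The sketch below is only a model of that idea:
struct reg_model and its helpers are hypothetical names, not anything in
regcprop.c, and the "garbage" argument stands for whatever the hardware
happens to leave in the upper half:

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit register plus a mask of the bits RTL treats as defined.  */
struct reg_model
{
  uint32_t bits;
  uint32_t defined;
};

/* Model of a HImode set: only bits 0-15 get a defined value.  */
struct reg_model
set_himode (uint16_t value, uint32_t upper_garbage)
{
  struct reg_model r;
  r.bits = (upper_garbage & 0xffff0000u) | value;
  r.defined = 0x0000ffffu;
  return r;
}

/* Model of an SImode copy: bits and definedness carry over unchanged.  */
struct reg_model
copy_simode (struct reg_model src)
{
  return src;
}

/* Two registers are interchangeable if they agree on all defined bits.  */
int
same_defined_value (struct reg_model a, struct reg_model b)
{
  return a.defined == b.defined
	 && ((a.bits ^ b.bits) & a.defined) == 0;
}
```

Under this model, two SImode copies of the same HImode value compare as
equal even when their upper 16 bits differ, which is the sense in which the
substitution is valid.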

Thanks,
Richard


Re: [r11-6755 Regression] FAIL: libstdc++-prettyprinters/libfundts.cc print os on Linux/x86_64

2021-01-18 Thread Mark Wielaard
Hi,

On Sun, Jan 17, 2021 at 11:30:59AM -0800, sunil.k.pandey wrote:
> On Linux/x86_64,
> 
> 3804e937b0e252a7e42632fe6d9f898f1851a49c is the first bad commit
> commit 3804e937b0e252a7e42632fe6d9f898f1851a49c
> Author: Mark Wielaard 
> Date:   Tue Sep 29 15:52:44 2020 +0200
> 
> Default to DWARF5
> 
> caused
> 
> 
> with GCC configured with
> 
> 

Thanks for the report, although it looks like some information about the
cause and configuration is missing from the above.

This is being tracked as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98716

The asan related failures seem to be caused by confusion about the
actual file an error occurred in when -flto and -gdwarf-5 are used.
Possibly a libbacktrace issue.

The pretty printer seems to not recognize
std::list, std::allocator >
as std::list

Cheers,

Mark 


Re: [PATCH] aix: Default to DWARF 4

2021-01-18 Thread Mark Wielaard
Hi David,

On Sun, Jan 17, 2021 at 06:12:06PM -0500, David Edelsohn wrote:
> GCC now defaults to DWARF 5.  AIX only supports DWARF 4 (3.5).
> 
> This patch overrides the default DWARF version to 4 unless explicitly
> stated.
> 
> gcc/ChangeLog:
> 
> * config/rs6000/aix71.h (SUBTARGET_OVERRIDE_OPTIONS): Override
> dwarf_version to 4.
> * config/rs6000/aix72.h (SUBTARGET_OVERRIDE_OPTIONS): Same.

Thanks, I hadn't tested against AIX.  Could you also update
gcc/doc/invoke.texi (-gdwarf) with the defaults for AIX?

Thanks,

Mark


Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-18 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 18, 2021 at 6:43 PM Hongtao Liu  wrote:
>
> On Mon, Jan 18, 2021 at 6:18 PM Richard Sandiford
>  wrote:
> >
> > Hongtao Liu via Gcc-patches  writes:
> > > Hi:
> > >   If SRC had been assigned a mode narrower than the copy, we can't link
> > > DEST into the chain even they have same
> > > hard_regno_nregs(i.e. HImode/SImode in i386 backend).
> >
> > In general, changes between modes within the same hard register are OK.
> > Could you explain in more detail what's going wrong?

Put simply: if a narrow-mode copy has the side effect of clearing the
upper bits of the same hard register, but this behavior is not
described in the insn pattern, isn't it wrong to add registers with
different modes to the same value chain?

> >
>
> cprop hardreg change
>
> (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> (reg:SI 37 r9 [orig:86 _11 ] [86])) "test.c":29:36 75 
> {*movsi_internal}
>  (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
> (nil)))
>
> to
>
> (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> (reg:SI 22 xmm2 [orig:86 _11 ] [86])) "test.c":29:36 75
> {*movsi_internal}
>  (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
> (nil)))
>
> since (reg:SI 22 xmm2) and (reg:SI r9) are in the same value chain in
> which the oldest regno is k0.
>
> but with xmm2 defined as
>
> kmovw %k0, %edi  # 69 [c=4 l=4] *movhi_internal/6- kmovw move the
> lower 16bits to %edi, and clear the upper 16 bits.
> vmovd %edi, %xmm2 # 489 *movsi_internal  --- vmovd move 32bits from
> %edi to %xmm2.
>
> (insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
> (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
> {*movhi_internal}
>  (nil))
>
> (insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
> (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
>  (nil))
> ...
> kmovd %k0, %r9d (movsi)  kmovd move 32bits from %k0 to %r9d.
>
> for %edi, bit 16-31 is cleared by kmovw which means %r9d is not equal
> to %xmm2 as a SImode value.
>
> > Thanks,
> > Richard
> >
> >
> > >
> > > i.e
> > > kmovw   %k0, %edi
> > > vmovd   %edi, %xmm2
> > > vpshuflw$0, %xmm2, %xmm0
> > > kmovw   %k0, %r8d
> > > kmovd   %k0, %r9d
> > > ...
> > > -movl %r9d, %r11d
> > > +vmovd %xmm2, %r11d
> > >
> > >   Bootstrap and regtested on x86_64-linux-gnu{-m32,}.
> > >   Ok for trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > > PR rtl-optimization/98694
> > > * regcprop.c (copy_value): If SRC had been assigned a mode
> > > narrower than the copy, we can't link DEST into the chain even
> > > they have same hard_regno_nregs(i.e. HImode/SImode in i386
> > > backend).
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR rtl-optimization/98694
> > > * gcc.target/i386/pr98694.c: New test.
> > >
> > >   ---
> > >  gcc/regcprop.c  |  3 +-
> > >  gcc/testsuite/gcc.target/i386/pr98694.c | 38 +
> > >  2 files changed, 40 insertions(+), 1 deletion(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr98694.c
> > >
> > > diff --git a/gcc/regcprop.c b/gcc/regcprop.c
> > > index dd62cb36013..997516eca07 100644
> > > --- a/gcc/regcprop.c
> > > +++ b/gcc/regcprop.c
> > > @@ -355,7 +355,8 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
> > >/* If SRC had been assigned a mode narrower than the copy, we can't
> > >   link DEST into the chain, because not all of the pieces of the
> > >   copy came from oldest_regno.  */
> > > -  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode))
> > > +  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode)
> > > +  || partial_subreg_p (vd->e[sr].mode, GET_MODE (src)))
> > >  return;
> > >
> > >/* Link DR at the end of the value chain used by SR.  */
> > > diff --git a/gcc/testsuite/gcc.target/i386/pr98694.c
> > > b/gcc/testsuite/gcc.target/i386/pr98694.c
> > > new file mode 100644
> > > index 000..611f9e77627
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/pr98694.c
> > > @@ -0,0 +1,38 @@
> > > +/* PR rtl-optimization/98694 */
> > > +/* { dg-do run { target { ! ia32 } } } */
> > > +/* { dg-options "-O2 -mavx512bw" } */
> > > +/* { dg-require-effective-target avx512bw } */
> > > +
> > > +#include
> > > +typedef short v4hi __attribute__ ((vector_size (8)));
> > > +typedef int v2si __attribute__ ((vector_size (8)));
> > > +v4hi b;
> > > +
> > > +__attribute__ ((noipa))
> > > +v2si
> > > +foo (__m512i src1, __m512i src2)
> > > +{
> > > +  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
> > > +  short s = (short) m;
> > > +  int i = (int)m;
> > > +  b = __extension__ (v4hi) {s, s, s, s};
> > > +  return __extension__ (v2si) {i, i};
> > > +}
> > > +
> > > +int main ()
> > > +{
> > > +  __m512i src1 = _mm512_setzero_si512 ();
> > > +  __m512i src2 = _mm512_set_epi8 (0, 1, 0, 1, 0, 1, 0, 1,
> > > +

[backport gcc10, gcc9] Request to backport PR97969

2021-01-18 Thread Przemyslaw Wirkus via Gcc-patches
Hi all,

Can we backport the PR97969 patch to GCC 10 and (maybe) GCC 9?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97969

IMHO the bug is severe and the fix could land in GCC 10 and 9. Vladimir's
original patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563322.html
applies without changes to both gcc-10 and gcc-9.

I've regression tested this patch on both the gcc-10 and gcc-9 branches
for x86_64 cross (arm-eabi target) and found no issues.

OK for gcc-10 and gcc-9?

PS: I can commit if approved.

Kind regards,
Przemyslaw Wirkus



Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-18 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 18, 2021 at 6:18 PM Richard Sandiford
 wrote:
>
> Hongtao Liu via Gcc-patches  writes:
> > Hi:
> >   If SRC had been assigned a mode narrower than the copy, we can't link
> > DEST into the chain even they have same
> > hard_regno_nregs(i.e. HImode/SImode in i386 backend).
>
> In general, changes between modes within the same hard register are OK.
> Could you explain in more detail what's going wrong?
>

cprop hardreg change

(insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
(reg:SI 37 r9 [orig:86 _11 ] [86])) "test.c":29:36 75 {*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
(nil)))

to

(insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
(reg:SI 22 xmm2 [orig:86 _11 ] [86])) "test.c":29:36 75
{*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
(nil)))

since (reg:SI 22 xmm2) and (reg:SI r9) are in the same value chain in
which the oldest regno is k0.

but with xmm2 defined as

kmovw %k0, %edi   # 69 [c=4 l=4] *movhi_internal/6 --- kmovw moves the
lower 16 bits to %edi and clears the upper 16 bits.
vmovd %edi, %xmm2 # 489 *movsi_internal --- vmovd moves 32 bits from
%edi to %xmm2.

(insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
(reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
{*movhi_internal}
 (nil))

(insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
(reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
 (nil))
...
kmovd %k0, %r9d (movsi)  kmovd moves 32 bits from %k0 to %r9d.

In %edi, bits 16-31 are cleared by kmovw, which means %r9d is not equal
to %xmm2 as an SImode value.
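
As a small stand-alone model of the two paths (not the testcase itself),
assuming kmovw zero-extends the low 16 mask bits and kmovd takes the low 32;
the function names and the mask value used to exercise them are
illustrative only:

```c
#include <assert.h>
#include <stdint.h>

/* Model of kmovw %k, %r32: low 16 mask bits, zero-extended to 32.  */
uint32_t
model_kmovw (uint64_t mask)
{
  return (uint32_t) (mask & 0xffffu);
}

/* Model of kmovd %k, %r32: low 32 mask bits.  */
uint32_t
model_kmovd (uint64_t mask)
{
  return (uint32_t) (mask & 0xffffffffu);
}

/* Non-zero when both paths produce the same 32-bit value.  */
int
paths_agree (uint64_t mask)
{
  return model_kmovw (mask) == model_kmovd (mask);
}
```

For a mask with any of bits 16-31 set, such as 0xAAAAAAAAAAAAAAAA, the two
results differ, so replacing %r9d by %xmm2 changes the value that reaches
%r11d.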

> Thanks,
> Richard
>
>
> >
> > i.e
> > kmovw   %k0, %edi
> > vmovd   %edi, %xmm2
> > vpshuflw$0, %xmm2, %xmm0
> > kmovw   %k0, %r8d
> > kmovd   %k0, %r9d
> > ...
> > -movl %r9d, %r11d
> > +vmovd %xmm2, %r11d
> >
> >   Bootstrap and regtested on x86_64-linux-gnu{-m32,}.
> >   Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR rtl-optimization/98694
> > * regcprop.c (copy_value): If SRC had been assigned a mode
> > narrower than the copy, we can't link DEST into the chain even
> > they have same hard_regno_nregs(i.e. HImode/SImode in i386
> > backend).
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR rtl-optimization/98694
> > * gcc.target/i386/pr98694.c: New test.
> >
> >   ---
> >  gcc/regcprop.c  |  3 +-
> >  gcc/testsuite/gcc.target/i386/pr98694.c | 38 +
> >  2 files changed, 40 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr98694.c
> >
> > diff --git a/gcc/regcprop.c b/gcc/regcprop.c
> > index dd62cb36013..997516eca07 100644
> > --- a/gcc/regcprop.c
> > +++ b/gcc/regcprop.c
> > @@ -355,7 +355,8 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
> >/* If SRC had been assigned a mode narrower than the copy, we can't
> >   link DEST into the chain, because not all of the pieces of the
> >   copy came from oldest_regno.  */
> > -  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode))
> > +  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode)
> > +  || partial_subreg_p (vd->e[sr].mode, GET_MODE (src)))
> >  return;
> >
> >/* Link DR at the end of the value chain used by SR.  */
> > diff --git a/gcc/testsuite/gcc.target/i386/pr98694.c
> > b/gcc/testsuite/gcc.target/i386/pr98694.c
> > new file mode 100644
> > index 000..611f9e77627
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr98694.c
> > @@ -0,0 +1,38 @@
> > +/* PR rtl-optimization/98694 */
> > +/* { dg-do run { target { ! ia32 } } } */
> > +/* { dg-options "-O2 -mavx512bw" } */
> > +/* { dg-require-effective-target avx512bw } */
> > +
> > +#include
> > +typedef short v4hi __attribute__ ((vector_size (8)));
> > +typedef int v2si __attribute__ ((vector_size (8)));
> > +v4hi b;
> > +
> > +__attribute__ ((noipa))
> > +v2si
> > +foo (__m512i src1, __m512i src2)
> > +{
> > +  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
> > +  short s = (short) m;
> > +  int i = (int)m;
> > +  b = __extension__ (v4hi) {s, s, s, s};
> > +  return __extension__ (v2si) {i, i};
> > +}
> > +
> > +int main ()
> > +{
> > +  __m512i src1 = _mm512_setzero_si512 ();
> > +  __m512i src2 = _mm512_set_epi8 (0, 1, 0, 1, 0, 1, 0, 1,
> > + 0, 1, 0, 1, 0, 1, 0, 1,
> > + 0, 1, 0, 1, 0, 1, 0, 1,
> > + 0, 1, 0, 1, 0, 1, 0, 1,
> > + 0, 1, 0, 1, 0, 1, 0, 1,
> > + 0, 1, 0, 1, 0, 1, 0, 1,
> > + 0, 1, 0, 1, 0, 1, 0, 1,
> > + 0, 1, 0, 1, 0, 1, 0, 1);
> > +  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
> > +  v2si a = foo (src1, src2);
> > +  if (a[0] != (int)m)
> > +__builtin_abort ();

Re: [PATCH] Modula-2 into the GCC tree on master

2021-01-18 Thread Richard Biener via Gcc-patches
On Mon, Jan 18, 2021 at 2:09 AM Gaius Mulley via Gcc-patches
 wrote:
>
>
> Hello,
>
> here is a patch which merges the gm2 front end into the GCC tree.  The
> patches have been bootstrapped under x86_64 GNU/Linux Debian Stretch
> built using make -j 24 and also under x86_64 GNU/Linux Debian Buster
> using make -j 4.
>
> Tested on Debian Stretch x86_64
> ===
>
> built GCC bootstrap 3 times:
>
> 1.  built vanilla GCC (enabling bootstrap) enabling front ends:
> brig,c,c++,go,d,fortran and ran the regression tests.
>
> 2.  the patches below were applied and associated tarball untarred.
> The same front ends brig,c,c++,go,d,fortran (again building from
> bootstrap) were enabled (no m2) and ran the regression tests.
> There were no changes to the regression test results between 1 and
> 2.
>
> 3.  Then it was rebuilt (from bootstrap) enabling the front ends
> brig,c,c++,go,d,fortran,m2 and ran the
> regression tests and again no extra failures were seen.
>
> [should I also be testing ada?]
>
> Built on Debian Buster x86_64
> =
>
> Built a patched tree enabling bootstrap make -j 4 for front ends c,c++,m2
> all compiled and bootstrapped.
>
> How to merge
> 
>
> 1.  apply patches below to the master GCC tree.
>
> 2.  cd gcc-git-top
> wget 
> http://floppsie.comp.glam.ac.uk/download/c/gm2-front-end-20210116-tar.gz
> tar zxf gm2-front-end-20210116-tar.gz
> rm gm2-front-end-20210116-tar.gz
> # new directories libgm2 and gcc/m2 are created and populated
>
> 3.  cd gcc-git-top
> autogen Makefile.def
> autoconf
> cd libgm2
> /bin/sh ./autogen.sh
>
> when built this implements iso, pim2, pim3 and pim4 editions of Modula-2
> with access to GCC features (gcc/m2/gm2.texi).
>
> hope this is useful - enjoy,

It looks like libgm2 is built independently of whether m2 is enabled or not?
I'd like to see a white-listing of supported targets like done for example
for libgomp via configure.tgt or for libgo (see toplevel configure).

The driver changes have been posted and reviewed previously but I
didn't see any real OK there but motivational questions - they never
were posted together with the m2 driver portion (I guess that would
be gcc/m2/gm2spec.c in the tarball).

I've not seen reviews or postings (besides as tarball) of the frontend
or the library (but I don't remember seeing extensive reviews of
other languages frontends or runtime portions at the point of their
inclusion - still the glueing to the middle-end should get the chance
to be reviewed).

I've tried to find my way through gcc/m2 but am quite lost in the
number of subdirectories.  I do see in gm2-lang.c and elsewhere
inclusion of system headers outside of system.h which is going
to be a portability problem.  From the parse_file langhook we
eventually dispatch to init_PerCompilationInit which looks
like a Modula-2 scaffolding file?  Is the compiler written in Modula-2?
It's not clear what parts make up the interface to the GCC middle-end.

I'm missing a patch for gcc/doc/install.texi which should list
requirements plus a patch to sourcebuild.texi listing the new
toplevel dirs (at least).

We don't usually ship "examples" in the GCC source tree.
There's also a gm2-tools directory whose name suggests those are
host tools, which should usually reside in the toplevel.

There's copies of gpl and gpl-3.0.texi files in m2/ but I think
all .texi stuff (even language specific) should be in gcc/doc/
and not the lang specific subdirectory.

I've just tried following the merge instructions and a build
on SUSE Leap 15.2 produces a toplevel m2/ and stage{1,2,3,4}
directories (empty?!) which hints at some bootstrapping magic taking place?
In the end the build fails like the following in stage2

bash: ..//home/rguenther/src/trunk/gcc/m2/tools-src/makeversion: No
such file or directory
make[3]: *** [/home/rguenther/src/trunk/gcc/m2/Make-lang.in:111:
gm2version-check] Error 127
make[3]: *** Waiting for unfinished jobs
/bin/sh: ..//home/rguenther/src/trunk/gcc/m2/configure: No such file
or directory
make[3]: *** [/home/rguenther/src/trunk/gcc/m2/Make-lang.in:1159:
m2/gm2config.h] Error 127

(sorry, parallel make), re-doing serial make on top of the above yields

bash: ..//home/rguenther/src/trunk/gcc/m2/tools-src/makeversion: No
such file or directory
make[3]: *** [/home/rguenther/src/trunk/gcc/m2/Make-lang.in:111:
gm2version-check] Error 127

looks like

gm2version-check:
cd m2 ; bash ../$(srcdir)/m2/tools-src/makeversion -p ../$(srcdir)
$(STAMP) gm2version-check

is bogus (in particular using $(srcdir) as part of a relative path?)
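
A tiny model of the path construction, to show why the "../" prefix only
works when $(srcdir) is relative.  The helper names and the /abs/src/gcc
path are made up, and careful_join is just one possible guard, not a
proposed fix for Make-lang.in:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* What "../$(srcdir)" expands to, regardless of the form of SRCDIR.  */
void
naive_join (char *out, size_t size, const char *srcdir)
{
  snprintf (out, size, "../%s", srcdir);
}

/* Only step up one directory when SRCDIR is relative.  */
void
careful_join (char *out, size_t size, const char *srcdir)
{
  if (srcdir[0] == '/')
    snprintf (out, size, "%s", srcdir);
  else
    snprintf (out, size, "../%s", srcdir);
}
```

With an absolute srcdir the naive form yields "..//abs/src/gcc", the same
shape as the "..//home/..." paths in the errors above.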

I've just done ./configure --enable-languages=m2; make -j24

I would suggest to not rush this in now during stage4
but instead take the opportunity of this "quiet" phase
to prepare an integration branch with all the issues above
sorted out which we can merge at the beginning of stage1
for GCC 12 (or later during stage4 if 

Re: BoF DWARF5 patches (25% .debug section size reduction)

2021-01-18 Thread Sebastian Huber

On 15/01/2021 17:16, Jakub Jelinek via Gcc-patches wrote:


On Sun, Nov 15, 2020 at 11:41:24PM +0100, Mark Wielaard wrote:

On Tue, 2020-09-29 at 15:56 +0200, Mark Wielaard wrote:

On Thu, 2020-09-10 at 13:16 +0200, Jakub Jelinek wrote:

On Wed, Sep 09, 2020 at 09:57:54PM +0200, Mark Wielaard wrote:

--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9057,13 +9057,14 @@ possible.
  @opindex gdwarf
  Produce debugging information in DWARF format (if that is supported).
  The value of @var{version} may be either 2, 3, 4 or 5; the default version
-for most targets is 4.  DWARF Version 5 is only experimental.
+for most targets is 5 (with the exception of vxworks and darwin which
+default to version 2).

I think in documentation we should spell these VxWorks and Darwin/Mac OS X

OK. As attached.

Are we ready to flip the default to 5?

Ping. It would be good to get this in now so that we can fix issues (if
any) with the DWARF5 support in the general bugfixing stage 3.

Thanks,

Mark
 From c04727b6209ad4d52d1b9ba86873961bda0e1724 Mon Sep 17 00:00:00 2001
From: Mark Wielaard
Date: Tue, 29 Sep 2020 15:52:44 +0200
Subject: [PATCH] Default to DWARF5

gcc/ChangeLog:

* common.opt (gdwarf-): Init(5).
* doc/invoke.texi (-gdwarf): Document default to 5.

Ok for trunk.


I noticed a build error with aarch64-rtems recently:

libtool: compile: 
/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/build/./gcc/xgcc 
-shared-libgcc 
-B/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/build/./gcc 
-nostdinc++ 
-L/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/build/aarch64-rtems7/libstdc++-v3/src 
-L/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/build/aarch64-rtems7/libstdc++-v3/src/.libs 
-L/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/build/aarch64-rtems7/libstdc++-v3/libsupc++/.libs 
-nostdinc 
-B/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/build/aarch64-rtems7/newlib/ 
-isystem 
/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/build/aarch64-rtems7/newlib/targ-include 
-isystem 
/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/gnu-mirror-gcc-0f951b3/newlib/libc/include 
-B/tmp/sh/rtems/7/aarch64-rtems7/bin/ 
-B/tmp/sh/rtems/7/aarch64-rtems7/lib/ -isystem 
/tmp/sh/rtems/7/aarch64-rtems7/include -isystem 
/tmp/sh/rtems/7/aarch64-rtems7/sys-include 
-I/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/gnu-mirror-gcc-0f951b3/libstdc++-v3/../libgcc 
-I/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/build/aarch64-rtems7/libstdc++-v3/include/aarch64-rtems7 
-I/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/build/aarch64-rtems7/libstdc++-v3/include 
-I/home/EB/sebastian_h/src/rtems-source-builder/rtems/build/aarch64-rtems7-gcc-0f951b3-newlib-9ad86f6-x86_64-linux-gnu-1/gnu-mirror-gcc-0f951b3/libstdc++-v3/libsupc++ 
-std=gnu++11 -fno-implicit-templates -Wall -Wextra -Wwrite-strings 
-Wcast-qual -Wabi=2 -fdiagnostics-show-location=once -ffunction-sections 
-fdata-sections -frandom-seed=cxx11-ios_failure.lo -g -O2 -g0 -c 
cxx11-ios_failure-lt.s -o cxx11-ios_failure.o

cxx11-ios_failure-lt.s: Assembler messages:
cxx11-ios_failure-lt.s:38443: Error: file number less than one

This is the related code:

.Ldebug_line0:
    .file 0 
"/tmp/sh/b-gcc-git-aarch64-rtems6/aarch64-rtems6/libstdc++-v3/src/c++11" 
"/home/EB/sebastian_h/src/gcc/libstdc++-v3/src/c++11/cxx11-ios_failure.cc"

    .section    .debug_str,"MS",@progbits,1

I am not sure if this is related to the change:

commit 3804e937b0e252a7e42632fe6d9f898f1851a49c
Author: Mark Wielaard 
AuthorDate: Tue Sep 29 15:52:44 2020 +0200
Commit: Mark Wielaard 
CommitDate: Sun Jan 17 01:36:39 2021 +0100

    Default to DWARF5

    gcc/ChangeLog:

    * common.opt (gdwarf-): Init(5).
    * doc/invoke.texi (-gdwarf): Document default to 5.

--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Court of registration: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/



Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-18 Thread Richard Sandiford via Gcc-patches
Hongtao Liu via Gcc-patches  writes:
> Hi:
>   If SRC had been assigned a mode narrower than the copy, we can't link
> DEST into the chain even they have same
> hard_regno_nregs(i.e. HImode/SImode in i386 backend).

In general, changes between modes within the same hard register are OK.
Could you explain in more detail what's going wrong?

Thanks,
Richard


>
> i.e
> kmovw   %k0, %edi
> vmovd   %edi, %xmm2
> vpshuflw$0, %xmm2, %xmm0
> kmovw   %k0, %r8d
> kmovd   %k0, %r9d
> ...
> -movl %r9d, %r11d
> +vmovd %xmm2, %r11d
>
>   Bootstrap and regtested on x86_64-linux-gnu{-m32,}.
>   Ok for trunk?
>
> gcc/ChangeLog:
>
> PR rtl-optimization/98694
> * regcprop.c (copy_value): If SRC had been assigned a mode
> narrower than the copy, we can't link DEST into the chain even
> they have same hard_regno_nregs(i.e. HImode/SImode in i386
> backend).
>
> gcc/testsuite/ChangeLog:
>
> PR rtl-optimization/98694
> * gcc.target/i386/pr98694.c: New test.
>
>   ---
>  gcc/regcprop.c  |  3 +-
>  gcc/testsuite/gcc.target/i386/pr98694.c | 38 +
>  2 files changed, 40 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr98694.c
>
> diff --git a/gcc/regcprop.c b/gcc/regcprop.c
> index dd62cb36013..997516eca07 100644
> --- a/gcc/regcprop.c
> +++ b/gcc/regcprop.c
> @@ -355,7 +355,8 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
>/* If SRC had been assigned a mode narrower than the copy, we can't
>   link DEST into the chain, because not all of the pieces of the
>   copy came from oldest_regno.  */
> -  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode))
> +  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode)
> +  || partial_subreg_p (vd->e[sr].mode, GET_MODE (src)))
>  return;
>
>/* Link DR at the end of the value chain used by SR.  */
> diff --git a/gcc/testsuite/gcc.target/i386/pr98694.c
> b/gcc/testsuite/gcc.target/i386/pr98694.c
> new file mode 100644
> index 000..611f9e77627
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr98694.c
> @@ -0,0 +1,38 @@
> +/* PR rtl-optimization/98694 */
> +/* { dg-do run { target { ! ia32 } } } */
> +/* { dg-options "-O2 -mavx512bw" } */
> +/* { dg-require-effective-target avx512bw } */
> +
> +#include
> +typedef short v4hi __attribute__ ((vector_size (8)));
> +typedef int v2si __attribute__ ((vector_size (8)));
> +v4hi b;
> +
> +__attribute__ ((noipa))
> +v2si
> +foo (__m512i src1, __m512i src2)
> +{
> +  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
> +  short s = (short) m;
> +  int i = (int)m;
> +  b = __extension__ (v4hi) {s, s, s, s};
> +  return __extension__ (v2si) {i, i};
> +}
> +
> +int main ()
> +{
> +  __m512i src1 = _mm512_setzero_si512 ();
> +  __m512i src2 = _mm512_set_epi8 (0, 1, 0, 1, 0, 1, 0, 1,
> + 0, 1, 0, 1, 0, 1, 0, 1,
> + 0, 1, 0, 1, 0, 1, 0, 1,
> + 0, 1, 0, 1, 0, 1, 0, 1,
> + 0, 1, 0, 1, 0, 1, 0, 1,
> + 0, 1, 0, 1, 0, 1, 0, 1,
> + 0, 1, 0, 1, 0, 1, 0, 1,
> + 0, 1, 0, 1, 0, 1, 0, 1);
> +  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
> +  v2si a = foo (src1, src2);
> +  if (a[0] != (int)m)
> +__builtin_abort ();
> +  return 0;
> +}
> -- 


[PATCH] libstdc++: Add workaround for as Error: file number less than one error [PR98708]

2021-01-18 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the PR, since the switch to DWARF5 by default instead of
DWARF4, gcc fails to build when configured against recent binutils.

The problem is that cxx11-ios_failure* is built in separate steps,
-S compilation (with -g -O2) followed by some sed and followed by
-c -g -O2 -g0 assembly.  When gcc is configured against recent binutils
and DWARF5 is the default, we emit a .file 0 "..." directive on which the
assembler then fails (unless --gdwarf-5 is passed to it, but we don't want
that generally because on the other side older assemblers don't like -g*
passed to them when invoked on a *.s file with compiler generated debug info).

I hope the bug will be fixed soon on the binutils side, but it would be nice
to have a workaround.

The following patch is one of the possibilities, another one is to do that
but add configure check for whether it is needed,
essentially
echo 'int main () { return 0; }' > conftest.c
${CXX} ${CXXFLAGS} -g -O2 -S conftest.c -o conftest.s
${CXX} ${CXXFLAGS} -g -O2 -g0 -c conftest.s -o conftest.o
and if the last command fails, we need that -gno-as-loc-support.
Or yet another option would be, I think, to do a different check, whether
${CXX} ${CXXFLAGS} -g -O2 -S conftest.c -o conftest.s
${CXX} ${CXXFLAGS} -g -O2 -c conftest.s -o conftest.o
works and if yes, don't add the -g0 to cxx11-ios_failure*.s assembly.

I've successfully bootstrapped/regtested this version on x86_64-linux and
i686-linux.

2021-01-18  Jakub Jelinek  

PR debug/98708
* src/c++11/Makefile.am (cxx11-ios_failure-lt.s, cxx11-ios_failure.s):
Compile with -gno-as-loc-support.
* src/c++11/Makefile.in: Regenerated.

--- libstdc++-v3/src/c++11/Makefile.am.jj   2021-01-04 10:26:02.067970728 +0100
+++ libstdc++-v3/src/c++11/Makefile.am  2021-01-17 17:20:58.580262364 +0100
@@ -141,12 +141,12 @@ if ENABLE_DUAL_ABI
 rewrite_ios_failure_typeinfo = sed -e 
'/^_*_ZTISt13__ios_failure:/,/_ZTVN10__cxxabiv120__si_class_type_infoE/s/_ZTVN10__cxxabiv120__si_class_type_infoE/_ZTVSt19__iosfail_type_info/'
 
 cxx11-ios_failure-lt.s: cxx11-ios_failure.cc
-   $(LTCXXCOMPILE) -S $< -o tmp-cxx11-ios_failure-lt.s
+   $(LTCXXCOMPILE) -gno-as-loc-support -S $< -o tmp-cxx11-ios_failure-lt.s
    -test -f tmp-cxx11-ios_failure-lt.o && mv -f tmp-cxx11-ios_failure-lt.o tmp-cxx11-ios_failure-lt.s
    $(rewrite_ios_failure_typeinfo) tmp-$@ > $@
    -rm -f tmp-$@
 cxx11-ios_failure.s: cxx11-ios_failure.cc
-   $(CXXCOMPILE) -S $< -o tmp-$@
+   $(CXXCOMPILE) -gno-as-loc-support -S $< -o tmp-$@
    $(rewrite_ios_failure_typeinfo) tmp-$@ > $@
    -rm -f tmp-$@
 
--- libstdc++-v3/src/c++11/Makefile.in.jj   2020-12-17 02:29:28.734557483 +0100
+++ libstdc++-v3/src/c++11/Makefile.in  2021-01-17 17:21:27.510931383 +0100
@@ -852,12 +852,12 @@ limits.o: limits.cc
    $(CXXCOMPILE) -fchar8_t -c $<
 
 @ENABLE_DUAL_ABI_TRUE@cxx11-ios_failure-lt.s: cxx11-ios_failure.cc
-@ENABLE_DUAL_ABI_TRUE@ $(LTCXXCOMPILE) -S $< -o tmp-cxx11-ios_failure-lt.s
+@ENABLE_DUAL_ABI_TRUE@ $(LTCXXCOMPILE) -gno-as-loc-support -S $< -o tmp-cxx11-ios_failure-lt.s
 @ENABLE_DUAL_ABI_TRUE@ -test -f tmp-cxx11-ios_failure-lt.o && mv -f tmp-cxx11-ios_failure-lt.o tmp-cxx11-ios_failure-lt.s
 @ENABLE_DUAL_ABI_TRUE@ $(rewrite_ios_failure_typeinfo) tmp-$@ > $@
 @ENABLE_DUAL_ABI_TRUE@ -rm -f tmp-$@
 @ENABLE_DUAL_ABI_TRUE@cxx11-ios_failure.s: cxx11-ios_failure.cc
-@ENABLE_DUAL_ABI_TRUE@ $(CXXCOMPILE) -S $< -o tmp-$@
+@ENABLE_DUAL_ABI_TRUE@ $(CXXCOMPILE) -gno-as-loc-support -S $< -o tmp-$@
 @ENABLE_DUAL_ABI_TRUE@ $(rewrite_ios_failure_typeinfo) tmp-$@ > $@
 @ENABLE_DUAL_ABI_TRUE@ -rm -f tmp-$@
 

Jakub



[PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-18 Thread Hongtao Liu via Gcc-patches
Hi:
  If SRC had been assigned a mode narrower than the copy, we can't link
DEST into the chain, even if both modes have the same hard_regno_nregs
(e.g. HImode/SImode in the i386 backend).

For example:
kmovw   %k0, %edi
vmovd   %edi, %xmm2
vpshuflw$0, %xmm2, %xmm0
kmovw   %k0, %r8d
kmovd   %k0, %r9d
...
-movl %r9d, %r11d
+vmovd %xmm2, %r11d
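
To make the failure mode concrete, here is a small C model of the sequence
above.  It is only a sketch, not GCC code: kmovw_model, kmovd_model,
struct reg_value and can_link_copy are hypothetical names, and the model
assumes that kmovw zero-extends the low 16 bits of the mask register, that
kmovd copies its low 32 bits, and that precision alone decides whether the
recorded mode is narrower than the copy.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: kmovw copies the low 16 bits of the mask register
   and zero-extends them into the 32-bit destination.  */
static uint32_t
kmovw_model (uint64_t k)
{
  return (uint16_t) k;
}

/* Hypothetical model: kmovd copies the low 32 bits of the mask register.  */
static uint32_t
kmovd_model (uint64_t k)
{
  return (uint32_t) k;
}

/* Hypothetical stand-in for what the pass records about a hard register:
   the precision of the mode it was last set in, and how many hard
   registers that mode occupies.  */
struct reg_value
{
  unsigned precision_bits;
  unsigned nregs;
};

/* Simplified version of the check: refuse to link DEST into SRC's value
   chain when SRC was set in fewer registers than the copy needs, or
   (the condition added by the patch) in a mode narrower than the copy.  */
static bool
can_link_copy (struct reg_value src, unsigned copy_precision_bits,
               unsigned copy_nregs)
{
  if (copy_nregs > src.nregs)
    return false;
  if (src.precision_bits < copy_precision_bits)
    return false;
  return true;
}
```

In this model a register set in a 16-bit mode and then copied in a 32-bit
mode occupies one hard register either way, so the register-count test
alone lets the copy through; only the precision test rejects it.  That
matters whenever bits 16..31 of the mask are nonzero, because the two
moves then produce different 32-bit values.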

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
  Ok for trunk?

gcc/ChangeLog:

PR rtl-optimization/98694
* regcprop.c (copy_value): If SRC had been assigned a mode
narrower than the copy, we can't link DEST into the chain even
if they have the same hard_regno_nregs (e.g. HImode/SImode in the
i386 backend).

gcc/testsuite/ChangeLog:

PR rtl-optimization/98694
* gcc.target/i386/pr98694.c: New test.

  ---
 gcc/regcprop.c  |  3 +-
 gcc/testsuite/gcc.target/i386/pr98694.c | 38 +
 2 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr98694.c

diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index dd62cb36013..997516eca07 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -355,7 +355,8 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
   /* If SRC had been assigned a mode narrower than the copy, we can't
  link DEST into the chain, because not all of the pieces of the
  copy came from oldest_regno.  */
-  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode))
+  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode)
+  || partial_subreg_p (vd->e[sr].mode, GET_MODE (src)))
 return;

   /* Link DR at the end of the value chain used by SR.  */
diff --git a/gcc/testsuite/gcc.target/i386/pr98694.c b/gcc/testsuite/gcc.target/i386/pr98694.c
new file mode 100644
index 000..611f9e77627
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98694.c
@@ -0,0 +1,38 @@
+/* PR rtl-optimization/98694 */
+/* { dg-do run { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512bw" } */
+/* { dg-require-effective-target avx512bw } */
+
+#include <immintrin.h>
+typedef short v4hi __attribute__ ((vector_size (8)));
+typedef int v2si __attribute__ ((vector_size (8)));
+v4hi b;
+
+__attribute__ ((noipa))
+v2si
+foo (__m512i src1, __m512i src2)
+{
+  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
+  short s = (short) m;
+  int i = (int)m;
+  b = __extension__ (v4hi) {s, s, s, s};
+  return __extension__ (v2si) {i, i};
+}
+
+int main ()
+{
+  __m512i src1 = _mm512_setzero_si512 ();
+  __m512i src2 = _mm512_set_epi8 (0, 1, 0, 1, 0, 1, 0, 1,
+ 0, 1, 0, 1, 0, 1, 0, 1,
+ 0, 1, 0, 1, 0, 1, 0, 1,
+ 0, 1, 0, 1, 0, 1, 0, 1,
+ 0, 1, 0, 1, 0, 1, 0, 1,
+ 0, 1, 0, 1, 0, 1, 0, 1,
+ 0, 1, 0, 1, 0, 1, 0, 1,
+ 0, 1, 0, 1, 0, 1, 0, 1);
+  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
+  v2si a = foo (src1, src2);
+  if (a[0] != (int)m)
+    __builtin_abort ();
+  return 0;
+}
-- 


-- 
BR,
Hongtao


Re: [stage1][PATCH] Change semantics of -frecord-gcc-switches and add -frecord-gcc-switches-format.

2021-01-18 Thread Martin Liška

On 1/15/21 8:14 PM, Alexandre Oliva wrote:

Hello, Martin,

Our testing detected unexpected -dumpbase-ext options making it into the
producer string.

I tracked it down to something weird that happened in this patch:

On Dec  4, 2020, Martin Liška  wrote:


+++ b/gcc/dwarf2out.c
-  case OPT_dumpbase:
-  case OPT_dumpbase_ext:
-  case OPT_dumpdir:



+++ b/gcc/opts.c
+  case OPT_dumpbase:
+  case OPT_dumpdir:


Assuming you didn't really mean to drop the option, the following patch
restores it to the exclusion list in the refactored gen_producer_string.


Hello.

Thank you for the fix. Yes, it was not an intentional change from my side.

Martin



Regstrapped on x86_64-linux-gnu, installing as obvious.


drop -dumpbase-ext from producer string

From: Alexandre Oliva 

The -dumpbase and -dumpdir options are excluded from the producer
string output in debug information, but -dumpbase-ext was not.  This
patch excludes it as well.
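
As a rough illustration of the filtering described above, here is a
self-contained C sketch.  It is not the code in opts.c: the real function
switches on OPT_* enumerators of decoded options, whereas the hypothetical
is_excluded_option and build_producer_options below compare plain option
names and ignore option arguments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of option names that should not be recorded.  */
static const char *const excluded_options[] = {
  "-o", "-d", "-dumpbase", "-dumpbase-ext", "-dumpdir", "-quiet", "-version"
};

/* Return true if OPT is one of the names that must be left out.  */
static bool
is_excluded_option (const char *opt)
{
  size_t n = sizeof excluded_options / sizeof excluded_options[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (opt, excluded_options[i]) == 0)
      return true;
  return false;
}

/* Append the non-excluded names from OPTS to BUF (of size BUFSIZE),
   separated by single spaces.  Return false if BUF is too small.  */
static bool
build_producer_options (char *buf, size_t bufsize,
                        const char *const *opts, size_t nopts)
{
  size_t used = 0;

  if (bufsize == 0)
    return false;
  buf[0] = '\0';

  for (size_t i = 0; i < nopts; i++)
    {
      if (is_excluded_option (opts[i]))
        continue;

      size_t len = strlen (opts[i]);
      size_t sep = used > 0 ? 1 : 0;
      if (used + sep + len + 1 > bufsize)
        return false;

      if (sep)
        buf[used++] = ' ';
      memcpy (buf + used, opts[i], len + 1);
      used += len;
    }
  return true;
}
```

Keeping the exclusions in one list makes an omission like the missing
-dumpbase-ext entry easy to spot next to its siblings.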


for  gcc/ChangeLog

* opts.c (gen_command_line_string): Exclude -dumpbase-ext.
---
  gcc/opts.c |1 +
  1 file changed, 1 insertion(+)

diff --git a/gcc/opts.c b/gcc/opts.c
index 527f0dde70685..437389b3de8e7 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -3284,6 +3284,7 @@ gen_command_line_string (cl_decoded_option *options,
case OPT_o:
case OPT_d:
case OPT_dumpbase:
+  case OPT_dumpbase_ext:
case OPT_dumpdir:
case OPT_quiet:
case OPT_version:






Re: [PATCH] alias: Fix offset checks involving section anchors [PR92294]

2021-01-18 Thread Richard Biener
On Mon, 18 Jan 2021, Jan Hubicka wrote:

> > This is a repost of:
> > 
> >   https://gcc.gnu.org/pipermail/gcc-patches/2020-February/539763.html
> > 
> > which was initially posted during stage 4.  (And yeah, I only just
> > missed stage 4 again.)
> > 
> > IMO it would be better to fix the bug directly (as the patch tries
> > to do) instead of wait for a more thorough redesign of this area.
> > See the end of:
> > 
> >   https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540002.html
> > 
> > for some stats.
> > 
> > Honza: Richard said he'd like your opinion on the patch.
> > 
> > 
> > memrefs_conflict_p has a slightly odd structure.  It first checks
> > whether two addresses based on SYMBOL_REFs refer to the same object,
> > with a tristate result:
> > 
> >   int cmp = compare_base_symbol_refs (x,y);
> > 
> > If the addresses do refer to the same object, we can use offset-based 
> > checks:
> > 
> >   /* If both decls are the same, decide by offsets.  */
> >   if (cmp == 1)
> >     return offset_overlap_p (c, xsize, ysize);
> > 
> > But then, apart from the special case of forced address alignment,
> > we use an offset-based check even if we don't know whether the
> > addresses refer to the same object:
> > 
> >   /* Assume a potential overlap for symbolic addresses that went
> >  through alignment adjustments (i.e., that have negative
> >  sizes), because we can't know how far they are from each
> >  other.  */
> >   if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
> >     return -1;
> >   /* If decls are different or we know by offsets that there is no overlap,
> >      we win.  */
> >   if (!cmp || !offset_overlap_p (c, xsize, ysize))
> >     return 0;
> > 
> > This somewhat contradicts:
> > 
> >   /* In general we assume that memory locations pointed to by different labels
> >      may overlap in undefined ways.  */
> 
> I suppose it is because the code above checks for SYMBOL_REF and not
> label (that is probably about jumptables and constpool injected into
> text segment).
> 
> I assume this is also partly a result of GCC not being very systematic about
> aliases.  Sometimes it assumes that two different symbols do not point
> to the same object, while in other cases it is worried about aliases.
> 
> I see that anchors are special since they point to the "same object" with
> different offsets.
> > 
> > at the end of compare_base_symbol_refs.  In other words, we're taking -1
> > to mean that either (a) the symbols are equal (via aliasing) or (b) the
> > references access non-overlapping objects.
> 
> For symbol refs yes, I think so.
> > 
> > But even assuming that's true for normal symbols, it doesn't cope
> > correctly with section anchors.  If a symbol X at ANCHOR+OFFSET is
> > preemptible, either (a) X = ANCHOR+OFFSET (rather than the X = ANCHOR
> > assumed above) or (b) X and ANCHOR reference non-overlapping objects.
> > 
> > And an offset-based comparison makes no sense for an anchor symbol
> > vs. a bare symbol with no decl.  If the bare symbol is allowed to
> > alias other symbols then it can surely alias any symbol in the
> > anchor's block, so there are multiple anchor offsets that might
> > induce an alias.
> > 
> > This patch therefore replaces the current tristate:
> > 
> >   - known equal
> >   - known independent (two accesses can't alias)
> >   - equal or independent
> > 
> > with:
> > 
> >   - known distance apart
> >   - known independent (two accesses can't alias)
> >   - known distance apart or independent
> >   - don't know
> > 
> > For safety, the patch puts all bare symbols in the "don't know"
> > category.  If that turns out to be too conservative, we at least
> > need that behaviour for combinations involving a bare symbol
> > and a section anchor.  However, bare symbols should be relatively
> > rare these days.
> 
> Well, in tree-ssa code we do assume these to be either disjoint objects
> or equal (in decl_refs_may_alias_p that continues in case
> compare_base_decls is -1).  I am not sure if we win much by treating
> them differently on RTL level. I would prefer staying consistent here.

So that's because if an alias (via alias attribute) is not visible
then it can be assumed to not exist.  Which means the bug is that
with section anchors we do not know which variables can be referred
to via the specific anchor? (if there's something like a "specific"
anchor)  That looks like the actual defect to me?  I see we have
block_symbol and object_block which may have all the data needed
in case accesses are somehow well-constrained?

> Otherwise the patch looks good to me.

So let's go with it?  It looks like for decl vs. section anchor we
can identify the offset of the decl in the anchor block and thus
determine an offset adjustment necessary to perform an offset based
check, no?
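
Roughly what I have in mind, as a standalone C sketch rather than alias.c
code: struct block_access and accesses_overlap are hypothetical, and the
sketch assumes that both bases are known to live in the same object block
at known offsets, and that the decl cannot be preempted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a memory access: a base (anchor or decl)
   that sits at BLOCK_OFFSET within some object block, plus a constant
   OFFSET from that base, accessing SIZE bytes.  */
struct block_access
{
  int64_t block_offset;
  int64_t offset;
  int64_t size;
};

/* Return true if X and Y, both expressed relative to the same object
   block, touch at least one common byte.  Adding BLOCK_OFFSET is the
   "offset adjustment": it rebases both addresses onto the block start
   so that a plain interval comparison is meaningful.  */
static bool
accesses_overlap (struct block_access x, struct block_access y)
{
  int64_t xstart = x.block_offset + x.offset;
  int64_t ystart = y.block_offset + y.offset;
  return xstart < ystart + y.size && ystart < xstart + x.size;
}
```

With an anchor at block offset 0 and a decl at block offset 16, a 4-byte
access at anchor+16 overlaps a 4-byte access at decl+0 but not one at
decl+4.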

Richard.

> Honza
> > 
> > Retested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.
> > OK to install?
> > 
> > Richard
> > 
> > 
> > gcc/
> >