date:20210217

Re: [PATCH v2 2/2] ada: add 128bit operation to MIPS N32 and N64

2021-02-17 Thread Arnaud Charlet

> For MIPS N64 and N32:
>   add GNATRTL_128BIT_PAIRS to LIBGNAT_TARGET_PAIRS
>   add GNATRTL_128BIT_OBJS to EXTRA_GNATRTL_NONTASKING_OBJS
> 
> gcc/ada/ChangeLog:
>   PR ada/98996
>   * Makefile.rtl (LIBGNAT_TARGET_PAIRS, EXTRA_GNATRTL_NONTASKING_OBJS)
>   : add 128Bit operation file to MIPS N64 and N32.

Note that the ChangeLog is generated automatically these days.

The change is OK, thanks.

> ---
>  gcc/ada/ChangeLog|  6 ++
>  gcc/ada/Makefile.rtl | 12 
>  2 files changed, 18 insertions(+)
> 
> diff --git a/gcc/ada/ChangeLog b/gcc/ada/ChangeLog
> index 52faefaa2ae..3565a32c5ac 100644
> --- a/gcc/ada/ChangeLog
> +++ b/gcc/ada/ChangeLog
> @@ -1,3 +1,9 @@
> +2021-02-18  YunQiang Su  
> +
> + PR ada/98996
> + * Makefile.rtl (LIBGNAT_TARGET_PAIRS, EXTRA_GNATRTL_NONTASKING_OBJS)
> + : add 128Bit operation file to MIPS N64 and N32.
> +
>  2021-02-12  Arnaud Charlet  
>  
>   * repinfo.ads, repinfo.adb (*SO_Ref*): Restore.
> diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
> index 35faf13ea46..d86eb8acbf3 100644
> --- a/gcc/ada/Makefile.rtl
> +++ b/gcc/ada/Makefile.rtl
> @@ -2311,6 +2311,18 @@ ifeq ($(strip $(filter-out mips% linux%,$(target_cpu) 
> $(target_os))),)
>s-tpopsp.adbsystem.ads  
> +  ifeq ($(strip $(filter-out mips64% mipsisa64%,$(target_cpu))),)
> +ifneq ($(strip $(MULTISUBDIR)),/32)
> +  LIBGNAT_TARGET_PAIRS += $(GNATRTL_128BIT_PAIRS)
> +  EXTRA_GNATRTL_NONTASKING_OBJS += $(GNATRTL_128BIT_OBJS)
> +endif
> +  else
> +ifeq ($(strip $(filter-out /64 /n32,$(MULTISUBDIR))),)
> +  LIBGNAT_TARGET_PAIRS += $(GNATRTL_128BIT_PAIRS)
> +  EXTRA_GNATRTL_NONTASKING_OBJS += $(GNATRTL_128BIT_OBJS)
> +endif
> +  endif
> +
>TOOLS_TARGET_PAIRS = indepsw.adb  
>EXTRA_GNATRTL_TASKING_OBJS=s-linux.o
> -- 
> 2.20.1

[PATCH v6] Add retain attribute to place symbols in SHF_GNU_RETAIN section

2021-02-17 Thread H.J. Lu via Gcc-patches

On Wed, Feb 17, 2021 at 12:50 PM H.J. Lu  wrote:
>
> On Wed, Feb 17, 2021 at 12:14 PM Jakub Jelinek  wrote:
> >
> > On Wed, Feb 17, 2021 at 12:04:34PM -0800, H.J. Lu wrote:> >
> > > +  /* For -fretain-used-symbol, a "used" attribute also implies "retain". 
> > >  */
> >
> > s/-symbol/-symbols/
>
> Fixed.
>
> > > +  if (flag_retain_used_symbols
> > > +  && attributes
> > > +  && (TREE_CODE (*node) == FUNCTION_DECL
> > > +   || (VAR_P (*node) && TREE_STATIC (*node))
> > > +   || (TREE_CODE (*node) == TYPE_DECL))
> >
> > What will "retain" do on TYPE_DECLs?  It only makes sense to me
> > for FUNCTION_DECL or non-automatic VAR_DECLs, unless for types
> > it means the types with that result in vars with those types automatically
> > getting "retain", but there is no code for that and I don't see "used"
> > handled that way.
>
> Fixed.
>
> > > +  if (SUPPORTS_SHF_GNU_RETAIN
> > > +  && (TREE_CODE (node) == FUNCTION_DECL
> > > +   || (VAR_P (node) && TREE_STATIC (node))
> > > +   || (TREE_CODE (node) == TYPE_DECL)))
> >
> > Likewise.
>
> Fixed.
>
> > > +;
> > > +  else
> > > +{
> > > +  warning (OPT_Wattributes, "%qE attribute ignored", name);
> > > +  *no_add_attrs = true;
> > > +}
> > > +
> > > +  return NULL_TREE;
> > > +}
> > > +
> > >  /* Handle a "externally_visible" attribute; arguments as in
> > > struct attribute_spec.handler.  */
> > >
> > > diff --git a/gcc/common.opt b/gcc/common.opt
> > > index c75dd36843e..70842481d4d 100644
> > > --- a/gcc/common.opt
> > > +++ b/gcc/common.opt
> > > @@ -2404,6 +2404,10 @@ frerun-loop-opt
> > >  Common Ignore
> > >  Does nothing.  Preserved for backward compatibility.
> > >
> > > +fretain-used-symbols
> > > +Common Var(flag_retain_used_symbols)
> > > +Make the used attribute to imply the retain attribute if supported.
> >
> > English is not my native language, but I think the "to " doesn't belong
> > here.
>
> Fixed.
>
> > > +@item -fretain-used-symbols
> > > +@opindex fretain-used-symbols
> > > +On systems with recent GNU assembler and linker, the compiler makes
> >
> > I think we should avoid recent here, new/old/recent etc. are relative terms.
> > Either use exact version (is it 2.36 or later) or say GNU assembler and
> > linker that supports the @code{.retain} directive or something similar.
>
> Fixed.
>
> > >flags |= SECTION_NAMED;
> > > -  if (SUPPORTS_SHF_GNU_RETAIN
> > > -  && decl != nullptr
> > > +  if (decl != nullptr
> > >&& DECL_P (decl)
> > > -  && DECL_PRESERVE_P (decl))
> > > +  && DECL_PRESERVE_P (decl)
> > > +  && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> > >  flags |= SECTION_RETAIN;
> > >if (*slot == NULL)
> > >  {
> >
> > I think none of the varasm.c code should use DECL_PRESERVE_P when it means
> > retain, just lookup_attribute.
>
> Fixed.
>
> > > @@ -487,7 +487,8 @@ resolve_unique_section (tree decl, int reloc 
> > > ATTRIBUTE_UNUSED,
> > >if (DECL_SECTION_NAME (decl) == NULL
> > >&& targetm_common.have_named_sections
> > >&& (flag_function_or_data_sections
> > > -   || (SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl))
> > > +   || (DECL_PRESERVE_P (decl)
> > > +   && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> > > || DECL_COMDAT_GROUP (decl)))
> > >  {
> > >targetm.asm_out.unique_section (decl, reloc);
> > > @@ -1227,7 +1228,8 @@ get_variable_section (tree decl, bool 
> > > prefer_noswitch_p)
> > >  vnode->get_constructor ();
> > >
> > >if (DECL_COMMON (decl)
> > > -  && !(SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl)))
> > > +  && !(DECL_PRESERVE_P (decl)
> > > +&& lookup_attribute ("retain", DECL_ATTRIBUTES (decl
> > >  {
> > >/* If the decl has been given an explicit section name, or it 
> > > resides
> > >in a non-generic address space, then it isn't common, and shouldn't
> > > @@ -7761,12 +7763,14 @@ switch_to_section (section *new_section, tree 
> > > decl)
> > >  {
> > >if (in_section == new_section)
> > >  {
> > > -  if (SUPPORTS_SHF_GNU_RETAIN
> > > -   && (new_section->common.flags & SECTION_NAMED)
> > > +  if ((new_section->common.flags & SECTION_NAMED)
> > > && decl != nullptr
> > > && DECL_P (decl)
> > > && (!!DECL_PRESERVE_P (decl)
> > > -   != !!(new_section->common.flags & SECTION_RETAIN)))
> > > +   != !!(new_section->common.flags & SECTION_RETAIN))
> > > +   && (lookup_attribute ("retain",
> > > + DECL_ATTRIBUTES (new_section->named.decl))
> > > +   || lookup_attribute ("retain", DECL_ATTRIBUTES (decl
> > >   {
> > > /* If the SECTION_RETAIN bit doesn't match, switch to a new
> > >section.  */
> > > --
> > > 2.29.2
> > >
>
> Here is the updated patch.
>

Here is the updated patch.   Tested on Linux/x86-64 with binutils
master branch.   OK for master?

Thanks.

-- 
H.J.
From 8c3

[PATCH v2 2/2] ada: add 128bit operation to MIPS N32 and N64

2021-02-17 Thread YunQiang Su

For MIPS N64 and N32:
  add GNATRTL_128BIT_PAIRS to LIBGNAT_TARGET_PAIRS
  add GNATRTL_128BIT_OBJS to EXTRA_GNATRTL_NONTASKING_OBJS

gcc/ada/ChangeLog:
PR ada/98996
* Makefile.rtl (LIBGNAT_TARGET_PAIRS, EXTRA_GNATRTL_NONTASKING_OBJS)
: add 128Bit operation file to MIPS N64 and N32.
---
 gcc/ada/ChangeLog|  6 ++
 gcc/ada/Makefile.rtl | 12 
 2 files changed, 18 insertions(+)

diff --git a/gcc/ada/ChangeLog b/gcc/ada/ChangeLog
index 52faefaa2ae..3565a32c5ac 100644
--- a/gcc/ada/ChangeLog
+++ b/gcc/ada/ChangeLog
@@ -1,3 +1,9 @@
+2021-02-18  YunQiang Su  
+
+   PR ada/98996
+   * Makefile.rtl (LIBGNAT_TARGET_PAIRS, EXTRA_GNATRTL_NONTASKING_OBJS)
+   : add 128Bit operation file to MIPS N64 and N32.
+
 2021-02-12  Arnaud Charlet  
 
* repinfo.ads, repinfo.adb (*SO_Ref*): Restore.
diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 35faf13ea46..d86eb8acbf3 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2311,6 +2311,18 @@ ifeq ($(strip $(filter-out mips% linux%,$(target_cpu) 
$(target_os))),)
   s-tpopsp.adb

[PATCH v2 1/2] MIPS: unaligned load: use SImode for SUBREG if OK (PR98996)

2021-02-17 Thread YunQiang Su

It is found by ada s-pack96.adb ftbfs, due to 96bit load: 96 = 64 + 32.
While the 32bit pair of l r is mark as SUBREG, so they are
not in SImode, make it fail to find suitable insn.

gcc/ChangeLog:

   * config/mips/mips.c (mips_expand_ext_as_unaligned_load):
   If TARGET_64BIT and dest is SUBREG, we check the width, if it
   equal to SImode, we use SImode operation, just like what we are
   doing for REG one.
---
 gcc/ChangeLog  | 8 
 gcc/config/mips/mips.c | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 0b3bdcee619..3c91791feac 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2021-02-18  YunQiang Su  
+
+   PR target/98996
+   * config/mips/mips.c (mips_expand_ext_as_unaligned_load):
+   If TARGET_64BIT and dest is SUBREG, we check the width, if it
+   equal to SImode, we use SImode operation, just like what we are
+   doing for REG one.
+
 2021-02-17  Julian Brown  
 
* gimplify.c (gimplify_scan_omp_clauses): Handle ATTACH_DETACH
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 8bd2d29552e..e901d860c3d 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -8400,7 +8400,7 @@ mips_expand_ext_as_unaligned_load (rtx dest, rtx src, 
HOST_WIDE_INT width,
   /* If TARGET_64BIT, the destination of a 32-bit "extz" or "extzv" will
  be a DImode, create a new temp and emit a zero extend at the end.  */
   if (GET_MODE (dest) == DImode
-  && REG_P (dest)
+  && (REG_P (dest) || (SUBREG_P(dest) && !MEM_P(SUBREG_REG(dest
   && GET_MODE_BITSIZE (SImode) == width)
 {
   dest1 = dest;
-- 
2.20.1

Re: [PATCH 1/2] MIPS: unaligned load: use SImode for SUBREG if OK (PR98996)

2021-02-17 Thread YunQiang Su

Jeff Law  于2021年2月17日周三 上午3:16写道：
>
>
>
> On 2/14/21 6:33 PM, YunQiang Su wrote:
> > It is found by ada s-pack96.adb ftbfs, due to 96bit load: 96 = 64 + 32.
> > While the 32bit pair of l r is mark as SUBREG, so they are
> > not in SImode, make it fail to find suitable insn.
> >
> > gcc/ChangeLog:
> >
> >* config/mips/mips.c (mips_expand_ext_as_unaligned_load):
> >If TARGET_64BIT and dest is SUBREG, we check the width, if it
> >equal to SImode, we use SImode operation, just like what we are
> >doing for REG one.
> > ---
> >  gcc/ChangeLog  | 8 
> >  gcc/config/mips/mips.c | 2 +-
> >  2 files changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> > index ddf4c7f92d7..fb12eeb971d 100644
> > --- a/gcc/ChangeLog
> > +++ b/gcc/ChangeLog
> > @@ -1,3 +1,11 @@
> > +2021-02-15  YunQiang Su  
> > +
> > + PR target/98996
> > + * config/mips/mips.c (mips_expand_ext_as_unaligned_load):
> > + If TARGET_64BIT and dest is SUBREG, we check the width, if it
> > + equal to SImode, we use SImode operation, just like what we are
> > + doing for REG one.
> Do you need to do any checking on the contents of the SUBREG?  ie, do
> you need to know if you've got (subreg (reg)) vs (subreg (mem))
>

Yes. you are right, we'd better to check it.

> Similarly I'd expect you may need to look at the mode of the inner
> object.  You could have a true subreg (outer mode is smaller than inner
> mode), but you might also be able to have a paradoxical subreg (outer
> mode is larger than inner mode).
>

I don't think that it is needed, since we we make sure that the mode
of `dest' is DImode,
and the width is SImode.

> Jeff
>

PING [PATCH] rs6000: Use rldimi for vec init instead of shift + ior

2021-02-17 Thread Kewen.Lin via Gcc-patches

Hi,

I'd like to gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-February/564758.html

BR,
Kewen

on 2021/2/3 下午2:37, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> This patch merges the previously approved one[1] and its relied patch
> made by Segher here[2], it's to make unsigned int vector init go with
> rldimi to merge two integers instead of shift and ior.
> 
> Segher's patch in [2] is required to make the test case pass,
> otherwise the costing for new pseudo-to-pseudo copies and the folding
> with nonzero_bits in combine will make the rl*imi pattern become
> compact and split into ior and shift unexpectedly.
> 
> The commit log of Segher's patch describes it in more details:
> 
> "An rl*imi is usually written as an IOR of an ASHIFT or similar, and an
> AND of a register with a constant mask.  In some cases combine knows
> that that AND doesn't do anything (because all zero bits in that mask
> correspond to bits known to be already zero), and then no pattern
> matches.  This patch adds a define_split for such cases.  It uses
> nonzero_bits in the condition of the splitter, but does not need it
> afterwards for the instruction to be recognised.  This is necessary
> because later passes can see fewer nonzero_bits.
> 
> Because it is a splitter, combine will only use it when starting with
> three insns (or more), even though the result is just one.  This isn't
> a huge problem in practice, but some possible combinations still won't
> happen."
> 
> Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
> powerpc64-linux-gnu P8, also SPEC2017 build/run passed on P9.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562407.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563526.html
> 
> 
> gcc/ChangeLog:
> 
> 2020-02-03  Segher Boessenkool  
>   Kewen Lin  
> 
>   * config/rs6000/rs6000.md (*rotl3_insert_3): Renamed to...
>   (rotl3_insert_3): ...this.
>   (plus_ior_xor): New code_iterator.
>   (define_split for GPR rl*imi): New splitter.
>   * config/rs6000/vsx.md (vsx_init_v4si): Use gen_rotldi3_insert_3
>   for integer merging.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/vec-init-10.c: New test.
> 
> -
>

Ping: [PATCH] rs6000: Convert the vector element register to SImode [PR98914]

2021-02-17 Thread Xionghu Luo via Gcc-patches


Gentle ping, thanks.


On 2021/2/3 17:01, Xionghu Luo wrote:

v[k] will also be expanded to IFN VEC_SET if k is long type when built
with -Og.  -O0 didn't exposed the issue due to v is TREE_ADDRESSABLE,
-O1 and above also didn't capture it because of v[k] is not optimized to
VIEW_CONVERT_EXPR(v)[k_1].
vec_insert defines the element argument type to be signed int by ELFv2
ABI, so convert it to SImode if it wasn't for Power target requirements.

gcc/ChangeLog:

2021-02-03  Xionghu Luo  

* config/rs6000/rs6000.c (rs6000_expand_vector_set): Convert
elt_rtx to SImode if it wasn't.

gcc/testsuite/ChangeLog:

2021-02-03  Xionghu Luo  

* gcc.target/powerpc/pr98914.c: New test.
---
  gcc/config/rs6000/rs6000.c | 17 ++---
  gcc/testsuite/gcc.target/powerpc/pr98914.c | 11 +++
  2 files changed, 21 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr98914.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ec068c58aa5..9f7f8da56c6 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -7000,8 +7000,6 @@ rs6000_expand_vector_set_var_p9 (rtx target, rtx val, rtx 
idx)
  
gcc_assert (VECTOR_MEM_VSX_P (mode) && !CONST_INT_P (idx));
  
-  gcc_assert (GET_MODE (idx) == E_SImode);

-
machine_mode inner_mode = GET_MODE (val);
  
rtx tmp = gen_reg_rtx (GET_MODE (idx));

@@ -7047,8 +7045,6 @@ rs6000_expand_vector_set_var_p8 (rtx target, rtx val, rtx 
idx)
  
gcc_assert (VECTOR_MEM_VSX_P (mode) && !CONST_INT_P (idx));
  
-  gcc_assert (GET_MODE (idx) == E_SImode);

-
machine_mode inner_mode = GET_MODE (val);
HOST_WIDE_INT mode_mask = GET_MODE_MASK (inner_mode);
  
@@ -7144,7 +7140,7 @@ rs6000_expand_vector_set (rtx target, rtx val, rtx elt_rtx)

machine_mode mode = GET_MODE (target);
machine_mode inner_mode = GET_MODE_INNER (mode);
rtx reg = gen_reg_rtx (mode);
-  rtx mask, mem, x;
+  rtx mask, mem, x, elt_si;
int width = GET_MODE_SIZE (inner_mode);
int i;
  
@@ -7154,16 +7150,23 @@ rs6000_expand_vector_set (rtx target, rtx val, rtx elt_rtx)

  {
if (!CONST_INT_P (elt_rtx))
{
+ /* elt_rtx should be SImode from ELFv2 ABI.  */
+ elt_si = gen_reg_rtx (E_SImode);
+ if (GET_MODE (elt_rtx) != E_SImode)
+   convert_move (elt_si, elt_rtx, 0);
+ else
+   elt_si = elt_rtx;
+
  /* For V2DI/V2DF, could leverage the P9 version to generate xxpermdi
 when elt_rtx is variable.  */
  if ((TARGET_P9_VECTOR && TARGET_POWERPC64) || width == 8)
{
- rs6000_expand_vector_set_var_p9 (target, val, elt_rtx);
+ rs6000_expand_vector_set_var_p9 (target, val, elt_si);
  return;
}
  else if (TARGET_P8_VECTOR && TARGET_DIRECT_MOVE_64BIT)
{
- rs6000_expand_vector_set_var_p8 (target, val, elt_rtx);
+ rs6000_expand_vector_set_var_p8 (target, val, elt_si);
  return;
}
}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr98914.c 
b/gcc/testsuite/gcc.target/powerpc/pr98914.c
new file mode 100644
index 000..e4d78e3e6b3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr98914.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-Og -mvsx" } */
+
+vector int
+foo (vector int v)
+{
+  for (long k = 0; k < 1; ++k)
+v[k] = 0;
+  return v;
+}



--
Thanks,
Xionghu

Re: Patch for PR analyzer/98797

2021-02-17 Thread David Malcolm via Gcc-patches

On Wed, 2021-02-17 at 23:18 +, brian.sobulefsky wrote:
> Hi David,
> 
> I'm sorry but I wanted to get this out to you. To clarify, I had a
> statement of the form:
> 
> end_offset <= covering_field->field_decl.bit_offset->int_cst.val[0]
>     covering_field->decl_common.size->int_cst.val[0] - 1;
> 
> (Sorry if my email is clobbering the angle brackets). I have replaced
> the first
> expression with int_bit_position (covering_field), I am not sure
> where to properly
> access the size of the field. FWIW, I found your
> region::get_byte_size, which
> calls int_size_in_bytes, but this crashes gcc for a field tree, it
> wants a
> type tree.

BTW, gcc/stor-layout.{c|h} holds the compiler's logic for laying out
the fields of structs and unions, determining bit-offsets and sizes. 
place_field is called repeatedly per field, and handles things like
sizes, alignments, etc.

I think you want DECL_SIZE() on the FIELD_DECL, which is typically a
INTEGER_CST (but won't necessarily be, consider VLAs, trailing empty
arrays, etc).  Quoting tree.h:

/* Holds the size of the datum, in bits, as a tree expression.
   Need not be constant and may be null.  May be less than TYPE_SIZE
   for a C++ FIELD_DECL representing a base class subobject with its
   own virtual base classes (which are laid out separately).  */
#define DECL_SIZE(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.size)

> Additionally, is there some proper way to access a bit_offset_t other
> than
> bit_off.get_val ()[0]? That is what I am using now, but I can swap it
> out.

bit_offset_t is an offset_int, which is defined in gcc/wide-int.h
though I confess I find that header file difficult to understand. 
There are some overloaded operators and member functions for working
with them; see that header.  That said, looking at the analyzer
sources, I've typically been converting them to HOST_WIDE_INT via
to_shwi ().  I suspect the way I'm doing that is buggy when dealing
with very large types.

> Sorry for the newbie questions, but these things aren't really
> documented
> in one place, at least that I am aware of.

Fair enough, and don't worry, we were all new once.  There is some
documentation in
  https://gcc.gnu.org/onlinedocs/gccint/index.html
e.g.:
  https://gcc.gnu.org/onlinedocs/gccint/GENERIC.html
but it's a lot to wade through and is mostly just as an API reference
rather than a tutorial.

Hope this is helpful
Dave

Re: [PATCH] PR 99133, Mark xxspltiw, xxspltidp, and xxsplti32x as being prefixed

2021-02-17 Thread Segher Boessenkool

Hi!

On Wed, Feb 17, 2021 at 12:17:30PM -0500, Michael Meissner wrote:
> I noticed that the power10 xxspltiw, xxspltidp, and xxsplti32dx
> instructions are not flagged as prefixed instructions, which means the
> instruction length is not set to 12 bytes.  This patch sets these
> instructions to be prefixed.  It also ensures that a leading 'p' is not
> emitted before the instruction.
> 
> I checked this patch by doing a bootstrap build/check on a little endian 
> power8
> server system.  There were no regressions.

Why test a p10 patch on a p8?

> In addition, I debugged cc1, and I
> put a breakpoint on the get_attr_length function and I verified the insns now
> have length 12.

You can just use -dp; the generated assembler output has lines like
pla 3,.LC0@pcrel # 6[c=4 l=12]  *pcrel_local_addr
(c is cost, l is length).

  fprintf (asm_out_file, "[c=%d",
   insn_cost (debug_insn, optimize_insn_for_speed_p ()));
  if (HAVE_ATTR_length)
fprintf (asm_out_file, " l=%d",
 get_attr_length (debug_insn));
  fprintf (asm_out_file, "]  ");

> +   Some prefixed instructions (xxspltiw, xxspltidp, xxsplti32dp, etc.) do not
> +   have a leading 'p'.  Setting the prefix attribute to special does not the 
> 'p'
> +   prefix.  */

(grammar)

"special" is the *normal* case.  *Most* prefixed insns will be like
this.  They aren't right now, but they will be.

It sounds like you should make a new attribute, "always_prefixed" or
something, that then the code that sets "prefixed" can use.

>  void
>  rs6000_final_prescan_insn (rtx_insn *insn, rtx [], int)
>  {
> -  next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO);
> +  next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO
> +   && get_attr_prefixed (insn) != PREFIXED_SPECIAL);
>return;
>  }

So you set next_insn_prefixed_p when exactly?  The original code was
correct, as far as I can see?

There are four kinds of insns now:

1) Never prefixed.
2) Always prefixed, like xxspltiw but many more in the future.
3) Sometimes prefixed, and they get a "p" mnemonic prefix then.  Those
are the majority currently, since we mostly have load/store insns that
are prefixed currently, and those usually have a non-prefixed form we
handled already.
4) Sometimes prefixed, and they get a "pm" prefix then.  We don't handle
those yet (those are the MMA GER insns).

The "prefixed" attribute should just mean if the instruction ended up as
some prefixed form (8 bytes).

So, for insns like xxspltiw you should just set "prefixed" to "yes",
because that makes sense, is not confusing.  The code that prefixes "p"
to the mnemonic should change, instead.  It can look at some new
attribute, but it could also just use
  prefixed_load_p (insn) || prefixed_store_p (insn)
  || prefixed_paddi_p (insn)
or similar (perhaps make a helper function for that then?)

Thanks,

Segher

[PATCH] match.pd: Restrict clz cmp 0 replacement by single_use, PR99142

2021-02-17 Thread Hans-Peter Nilsson via Gcc-patches

If we're not going to eliminate the clz, it's better for the
comparison to use that result than its input, so we don't
extend the lifetime of the input.  Also, an additional use
of the result is more likely cheaper than a compare of the
input, in particular considering that the clz may have made
available a non-zero condition matching the original use.
The "s" modifier doesn't stop this situation, as the
transformation wouldn't result in "an expression with more
than one operator"; a gating single_use condition on the
result is necessary.

Regtested cris-elf and x86_64-linux.
Ok to commit?

gcc:
PR tree-optimization/99142
* match.pd (clz cmp 0): Gate replacement on single_use of clz result.

gcc/testsuite:
PR tree-optimization/99142
* testsuite/gcc.dg/tree-ssa/pr99142.c: New test.
---
 gcc/match.pd|  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr99142.c | 14 ++
 2 files changed, 16 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr99142.c

diff --git a/gcc/match.pd b/gcc/match.pd
index e14f69744d7e..760f773cf1b1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6315,8 +6315,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (for op (eq ne)
   cmp (lt ge)
   (simplify
-   (op (clz:s @0) INTEGER_CST@1)
-   (if (integer_zerop (@1))
+   (op (clz:s@2 @0) INTEGER_CST@1)
+   (if (integer_zerop (@1) && single_use (@2))
 /* clz(X) == 0 is (int)X < 0 and clz(X) != 0 is (int)X >= 0.  */
 (with { tree stype = signed_type_for (TREE_TYPE (@0));
HOST_WIDE_INT val = 0;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr99142.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr99142.c
new file mode 100644
index ..1781a89de32a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr99142.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not " >= 0\\)" "optimized" } } */
+int f(int a, int *b, int *d)
+{
+  int c = __builtin_clz(a);
+
+  *b = c;
+
+  if (c != 0)
+*d = c;
+
+  return c;
+}
-- 
2.11.0

Re: Patch for PR analyzer/98797

2021-02-17 Thread brian.sobulefsky via Gcc-patches

Hi David,

I'm sorry but I wanted to get this out to you. To clarify, I had a statement of 
the form:

end_offset <= covering_field->field_decl.bit_offset->int_cst.val[0]
covering_field->decl_common.size->int_cst.val[0] - 1;

(Sorry if my email is clobbering the angle brackets). I have replaced the first
expression with int_bit_position (covering_field), I am not sure where to 
properly
access the size of the field. FWIW, I found your region::get_byte_size, which
calls int_size_in_bytes, but this crashes gcc for a field tree, it wants a
type tree.

Additionally, is there some proper way to access a bit_offset_t other than
bit_off.get_val ()[0]? That is what I am using now, but I can swap it out.

Sorry for the newbie questions, but these things aren't really documented
in one place, at least that I am aware of.


Brian

[committed] testsuite: add regression test for PR analyzer/94596

2021-02-17 Thread David Malcolm via Gcc-patches

This use-after-free false positive affected GCC 10, but seems to be
fixed in trunk for GCC 11; adding a reduced version as a regression
test.

Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7272-g963aecff2473080d748b2fc1ea2e32cef36cab11.

gcc/testsuite/ChangeLog:
PR analyzer/94596
* gcc.dg/analyzer/pr94596.c: New test.
---
 gcc/testsuite/gcc.dg/analyzer/pr94596.c | 97 +
 1 file changed, 97 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr94596.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/pr94596.c 
b/gcc/testsuite/gcc.dg/analyzer/pr94596.c
new file mode 100644
index 000..055d2098064
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr94596.c
@@ -0,0 +1,97 @@
+/* Minimized/hacked up from openvswitch lib/conntrack.c, which had this license
+   header:  */
+/*
+ * Copyright (c) 2015-2019 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+typedef __SIZE_TYPE__ size_t;
+#define NULL ((void *)0)
+#define false 0
+
+#define OBJECT_OFFSETOF(OBJECT, MEMBER)\
+__builtin_offsetof(typeof(*(OBJECT)), MEMBER)
+
+#define OBJECT_CONTAINING(POINTER, OBJECT, MEMBER)  \
+((typeof(OBJECT)) (void *)  \
+ ((char *) (POINTER) - OBJECT_OFFSETOF(OBJECT, MEMBER)))
+
+#define ASSIGN_CONTAINER(OBJECT, POINTER, MEMBER)  \
+((OBJECT) = OBJECT_CONTAINING(POINTER, OBJECT, MEMBER), (void) 0)
+
+#define INIT_CONTAINER(OBJECT, POINTER, MEMBER)
\
+((OBJECT) = NULL, ASSIGN_CONTAINER(OBJECT, POINTER, MEMBER))
+
+#define HMAP_FOR_EACH_POP(NODE, MEMBER, HMAP)  \
+for (size_t bucket__ = 0;   \
+ INIT_CONTAINER(NODE, hmap_pop_helper__(HMAP, &bucket__), MEMBER),  \
+ (NODE != OBJECT_CONTAINING(NULL, NODE, MEMBER))\
+ || ((NODE = NULL), false);)
+
+struct hmap {
+struct hmap_node **buckets;
+struct hmap_node *one;
+size_t mask;
+size_t n;
+};
+
+struct hmap_node {
+size_t hash;
+struct hmap_node *next;
+};
+
+static inline void hmap_remove(struct hmap *, struct hmap_node *);
+
+struct hmap_node *
+hmap_pop_helper__(struct hmap *hmap, size_t *bucket) {
+
+for (; *bucket <= hmap->mask; (*bucket)++) {
+struct hmap_node *node = hmap->buckets[*bucket];
+
+if (node) {
+hmap_remove(hmap, node);
+return node;
+}
+}
+
+return NULL;
+}
+
+static inline void
+hmap_remove(struct hmap *hmap, struct hmap_node *node)
+{
+struct hmap_node **bucket = &hmap->buckets[node->hash & hmap->mask];
+while (*bucket != node) {
+bucket = &(*bucket)->next;
+}
+*bucket = node->next;
+hmap->n--;
+}
+
+struct conntrack {
+struct hmap zone_limits;
+};
+
+struct zone_limit {
+struct hmap_node node;
+};
+
+void
+conntrack_destroy(struct conntrack *ct)
+{
+struct zone_limit *zl;
+HMAP_FOR_EACH_POP (zl, node, &ct->zone_limits) {
+  __builtin_free(zl);
+}
+}
-- 
2.26.2

[r11-7271 Regression] FAIL: g++.dg/modules/pr99023_a.H (test for excess errors) on Linux/x86_64

2021-02-17 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

d8889c99aab4b599aa7ceb7079e69a9766171336 is the first bad commit
commit d8889c99aab4b599aa7ceb7079e69a9766171336
Author: Nathan Sidwell 
Date:   Wed Feb 17 10:43:21 2021 -0800

c++: Macros need to be GTY-reachable [PR 99023]

caused

FAIL: g++.dg/modules/pr99023_a.H module-cmi ,/pr99023_a.H 
(gcm.cache/,/pr99023_a.H.gcm)
FAIL: g++.dg/modules/pr99023_a.H (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-7271/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/pr99023_a.H 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/pr99023_a.H --target_board='unix{-m32\ 
-march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

Re: [Patch] Fortran: Fix rank of assumed-rank array [PR99043]

2021-02-17 Thread Thomas Koenig via Gcc-patches




Hi Tobias,


OK for mainline?
Reported against GCC 10, not a regression but simple wrong-code fix;
does it make sense to apply there was well?


OK for both.

Thanks for the patch!

Best regards

Thomas

Re: [Patch] Fortran: Fix ubound simplifcation [PR99027]

2021-02-17 Thread Thomas Koenig via Gcc-patches


Hi Tobias,


The problem which accessing dim= for an expression is that
when the argument is an array, dim= and expr->rank are relative
to the resulting array – but the array-ref also contains the
DIMEN_ELEMENT which does not count for dim=/rank.

OK for the trunk? (Reported against GCC 11, but I wonder whether
we should still apply it to GCC 10.)


OK, also OK for gcc 10 (a clear wrong-code bug with a
clear solution is fine, IMHO).

Best regards

Thomas

Re: [PATCH] array-bounds: Fix up ICE on overaligned variables [PR99109]

2021-02-17 Thread Jakub Jelinek via Gcc-patches

On Wed, Feb 17, 2021 at 02:38:04PM -0700, Martin Sebor wrote:
> How does build_printable_array_type sound?

I'll go with that.

> Also, would using TYPE_MAIN_VARIANT whenever TYPE_USER_ALIGN is set
> be a simpler solution?  (It might not be as refined as the test in
> your patch but I don't think we'd be giving up too much by
> the simplification.)

That seems like a bad idea.  TYPE_USER_ALIGN is set quite often,
but in most cases the type size is a multiple of the alignment.

Jakub

Re: [PATCH] array-bounds: Fix up ICE on overaligned variables [PR99109]

2021-02-17 Thread Martin Sebor via Gcc-patches


On 2/17/21 1:56 PM, Jakub Jelinek wrote:

On Wed, Feb 17, 2021 at 01:38:56PM -0700, Martin Sebor wrote:

-   reftype = build_array_type_nelts (reftype, 1);
+   {
+ if (overaligned_type_p (reftype))
+   reftype = TYPE_MAIN_VARIANT (reftype);
+ reftype = build_array_type_nelts (reftype, 1);
+   }


Rather than complicating the logic in the caller (which is already
long and hard to follow) I'd suggest to consider changing
the build_zero_elt_array_type() helper to handle both kinds of arrays
(i.e., zero-length and otherwise), and strip the excess alignment
from reftype in it.  That will simplify the function.


That will mean the overaligned type checking will need to be done
even for the case where reftype was originally an ARRAY_TYPE and the code
just picks up an element type out of it (in that case it is known that it
will succeed).
And there is a question how to name it, build_array_type_nelts is already
taken and it might be confusing what this almost like build_array_type_nelts
but with extras differs from the tree.c function.

Anyway, can change it if you can suggest a good name...


How does build_printable_array_type sound?

Also, would using TYPE_MAIN_VARIANT whenever TYPE_USER_ALIGN is set
be a simpler solution?  (It might not be as refined as the test in
your patch but I don't think we'd be giving up too much by
the simplification.)

Martin

Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-02-17 Thread Martin Sebor via Gcc-patches


On 2/17/21 1:47 PM, Jakub Jelinek wrote:

On Wed, Feb 17, 2021 at 01:27:55PM -0700, Martin Sebor wrote:

Not in this patch, but I've looked at what maxobjsize is and wonder why
the roundtrip tree -> HOST_WIDE_INT -> offset_int:
   const offset_int maxobjsize = tree_to_shwi (max_object_size ());
Can't it be
   const offset_int maxobjsize = wi::to_offset (max_object_size ());
?


Yes, that's what it is elsewhere so this should do the same.  I've
changed it.




I'm pretty sure that's because wide_int doesn't have division and
I assumed offset_int didn't either when I originally wrote the code.
I've changed it to use division.


wide_int does have division, otherwise offset_int wouldn't have it either.
One needs to choose if one wants signed or unsigned division, operator /
does signed, one can use wi::{,s,u}div_{trunc,ceil,round} etc.
As maxobjsize shouldn't have MSB set, it probably doesn't matter if one
uses signed or unsigned division.

+ tree nelts = array_type_nelts (reftype);
+ if (integer_all_onesp (nelts))
+   /* Zero length array.  */
+   arrbounds[1] = 0;


Ok then.


+ else
{
- tree bnds[] = { TYPE_MIN_VALUE (dom), TYPE_MAX_VALUE (dom) };
- if (TREE_CODE (arg) == COMPONENT_REF)
-   {
- offset_int size = maxobjsize;
- if (tree fldsize = component_ref_size (arg))
-   size = wi::to_offset (fldsize);
- arrbounds[1] = wi::lrshift (size, eltsizelog2);
-   }
- else if (array_at_struct_end_p (arg) || !bnds[0] || !bnds[1])
-   arrbounds[1] = wi::lrshift (maxobjsize, eltsizelog2);
+ tree esz = TYPE_SIZE_UNIT (TREE_TYPE (reftype));
+ if (TREE_CODE (esz) == INTEGER_CST)
+   /* Array element is either not a VLA or it's a VLA with
+  zero size (such as int A[n][n][0];).  */
+   eltsize = wi::to_offset (esz);
  else
-   arrbounds[1] = (wi::to_offset (bnds[1]) - wi::to_offset 
(bnds[0])
-   + 1) * eltsize;
+   return false;
+
+ if (TREE_CODE (nelts) == INTEGER_CST)
+   arrbounds[1] = (wi::to_offset (nelts) + 1) * eltsize;
+ else if (eltsize == 0)
+   arrbounds[1] = 0;


Doesn't arrbounds[1] == 0 mean there will be warnings for any accesses?
For eltsize == 0 I think you shouldn't warn when nelts isn't known,
instead of always warning, arr[1] will have the same address as
arr[0] ...


This branch is entered for VLAs of zero-length arrays where we want
to warn, like this:

void f (void*);

void g (int n)
{
  int a[n][0];
  ((int*)a)[0] = 0;
  f (a);
}

Martin



Otherwise LGTM.

Jakub

Re: [[PATCH] 1/3] aarch64: Sanitize access to cfun in aarch64_declare_function_name()

2021-02-17 Thread Andrew Pinski via Gcc-patches

On Wed, Dec 9, 2020 at 9:21 AM Christoph Müllner
 wrote:
>
> From: Christoph Muellner 
>
> It is possible to call aarch64_declare_function_name() and
> have cfun not set. Let's sanitize the access to this variable.
>
> gcc/
>
> * config/aarch64/aarch64.c (aarch64_declare_function_name):
> Santize access to cfun.
> ---
>  gcc/config/aarch64/aarch64.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 67ffba02d3e..264ccb8beb2 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -19317,7 +19317,8 @@ aarch64_declare_function_name (FILE *stream, const 
> char* name,
>ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
>ASM_OUTPUT_LABEL (stream, name);
>
> -  cfun->machine->label_is_assembled = true;
> +  if (cfun)
> +cfun->machine->label_is_assembled = true;
>  }


Hmm, looking through the usage of ASM_DECLARE_FUNCTION_NAME, cfun has
to be set, and usually that is set when current_function_decl is also
set.
Can you figure out why cfun is not set but current_function_decl is set?

Thanks,
Andrew Pinski

>
>  /* Implement PRINT_PATCHABLE_FUNCTION_ENTRY.  Check if the patch area is 
> after
> --
> 2.29.2
>

Re: [PATCH] array-bounds: Fix up ICE on overaligned variables [PR99109]

2021-02-17 Thread Jakub Jelinek via Gcc-patches

On Wed, Feb 17, 2021 at 01:38:56PM -0700, Martin Sebor wrote:
> > -   reftype = build_array_type_nelts (reftype, 1);
> > +   {
> > + if (overaligned_type_p (reftype))
> > +   reftype = TYPE_MAIN_VARIANT (reftype);
> > + reftype = build_array_type_nelts (reftype, 1);
> > +   }
> 
> Rather than complicating the logic in the caller (which is already
> long and hard to follow) I'd suggest to consider changing
> the build_zero_elt_array_type() helper to handle both kinds of arrays
> (i.e., zero-length and otherwise), and strip the excess alignment
> from reftype in it.  That will simplify the function.

That will mean the overaligned type checking will need to be done
even for the case where reftype was originally an ARRAY_TYPE and the code
just picks up an element type out of it (in that case it is known that it
will succeed).
And there is a question how to name it, build_array_type_nelts is already
taken and it might be confusing what this almost like build_array_type_nelts
but with extras differs from the tree.c function.

Anyway, can change it if you can suggest a good name...

Jakub

[PATCH v5] Add retain attribute to place symbols in SHF_GNU_RETAIN section

2021-02-17 Thread H.J. Lu via Gcc-patches

On Wed, Feb 17, 2021 at 12:14 PM Jakub Jelinek  wrote:
>
> On Wed, Feb 17, 2021 at 12:04:34PM -0800, H.J. Lu wrote:> >
> > +  /* For -fretain-used-symbol, a "used" attribute also implies "retain".  
> > */
>
> s/-symbol/-symbols/

Fixed.

> > +  if (flag_retain_used_symbols
> > +  && attributes
> > +  && (TREE_CODE (*node) == FUNCTION_DECL
> > +   || (VAR_P (*node) && TREE_STATIC (*node))
> > +   || (TREE_CODE (*node) == TYPE_DECL))
>
> What will "retain" do on TYPE_DECLs?  It only makes sense to me
> for FUNCTION_DECL or non-automatic VAR_DECLs, unless for types
> it means the types with that result in vars with those types automatically
> getting "retain", but there is no code for that and I don't see "used"
> handled that way.

Fixed.

> > +  if (SUPPORTS_SHF_GNU_RETAIN
> > +  && (TREE_CODE (node) == FUNCTION_DECL
> > +   || (VAR_P (node) && TREE_STATIC (node))
> > +   || (TREE_CODE (node) == TYPE_DECL)))
>
> Likewise.

Fixed.

> > +;
> > +  else
> > +{
> > +  warning (OPT_Wattributes, "%qE attribute ignored", name);
> > +  *no_add_attrs = true;
> > +}
> > +
> > +  return NULL_TREE;
> > +}
> > +
> >  /* Handle a "externally_visible" attribute; arguments as in
> > struct attribute_spec.handler.  */
> >
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index c75dd36843e..70842481d4d 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -2404,6 +2404,10 @@ frerun-loop-opt
> >  Common Ignore
> >  Does nothing.  Preserved for backward compatibility.
> >
> > +fretain-used-symbols
> > +Common Var(flag_retain_used_symbols)
> > +Make the used attribute to imply the retain attribute if supported.
>
> English is not my native language, but I think the "to " doesn't belong
> here.

Fixed.

> > +@item -fretain-used-symbols
> > +@opindex fretain-used-symbols
> > +On systems with recent GNU assembler and linker, the compiler makes
>
> I think we should avoid recent here, new/old/recent etc. are relative terms.
> Either use exact version (is it 2.36 or later) or say GNU assembler and
> linker that supports the @code{.retain} directive or something similar.

Fixed.

> >flags |= SECTION_NAMED;
> > -  if (SUPPORTS_SHF_GNU_RETAIN
> > -  && decl != nullptr
> > +  if (decl != nullptr
> >&& DECL_P (decl)
> > -  && DECL_PRESERVE_P (decl))
> > +  && DECL_PRESERVE_P (decl)
> > +  && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> >  flags |= SECTION_RETAIN;
> >if (*slot == NULL)
> >  {
>
> I think none of the varasm.c code should use DECL_PRESERVE_P when it means
> retain, just lookup_attribute.

Fixed.

> > @@ -487,7 +487,8 @@ resolve_unique_section (tree decl, int reloc 
> > ATTRIBUTE_UNUSED,
> >if (DECL_SECTION_NAME (decl) == NULL
> >&& targetm_common.have_named_sections
> >&& (flag_function_or_data_sections
> > -   || (SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl))
> > +   || (DECL_PRESERVE_P (decl)
> > +   && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> > || DECL_COMDAT_GROUP (decl)))
> >  {
> >targetm.asm_out.unique_section (decl, reloc);
> > @@ -1227,7 +1228,8 @@ get_variable_section (tree decl, bool 
> > prefer_noswitch_p)
> >  vnode->get_constructor ();
> >
> >if (DECL_COMMON (decl)
> > -  && !(SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl)))
> > +  && !(DECL_PRESERVE_P (decl)
> > +&& lookup_attribute ("retain", DECL_ATTRIBUTES (decl
> >  {
> >/* If the decl has been given an explicit section name, or it resides
> >in a non-generic address space, then it isn't common, and shouldn't
> > @@ -7761,12 +7763,14 @@ switch_to_section (section *new_section, tree decl)
> >  {
> >if (in_section == new_section)
> >  {
> > -  if (SUPPORTS_SHF_GNU_RETAIN
> > -   && (new_section->common.flags & SECTION_NAMED)
> > +  if ((new_section->common.flags & SECTION_NAMED)
> > && decl != nullptr
> > && DECL_P (decl)
> > && (!!DECL_PRESERVE_P (decl)
> > -   != !!(new_section->common.flags & SECTION_RETAIN)))
> > +   != !!(new_section->common.flags & SECTION_RETAIN))
> > +   && (lookup_attribute ("retain",
> > + DECL_ATTRIBUTES (new_section->named.decl))
> > +   || lookup_attribute ("retain", DECL_ATTRIBUTES (decl
> >   {
> > /* If the SECTION_RETAIN bit doesn't match, switch to a new
> >section.  */
> > --
> > 2.29.2
> >

Here is the updated patch.

-- 
H.J.
From 28017e20ba70140f812934c91673ba4e7b4493c4 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 15 Feb 2021 11:31:12 -0800
Subject: [PATCH v5] Add retain attribute to place symbols in SHF_GNU_RETAIN
 section

When building Linux kernel, ld in bninutils 2.36 with GCC 11 generates
thousands of

ld: warning: orphan section `.data.event_initcall_finish' from `init/main.o' being placed in section `.data.event_initcall_finish'
ld: warning:

Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-02-17 Thread Jakub Jelinek via Gcc-patches

On Wed, Feb 17, 2021 at 01:27:55PM -0700, Martin Sebor wrote:

Not in this patch, but I've looked at what maxobjsize is and wonder why
the roundtrip tree -> HOST_WIDE_INT -> offset_int:
  const offset_int maxobjsize = tree_to_shwi (max_object_size ());
Can't it be
  const offset_int maxobjsize = wi::to_offset (max_object_size ());
?

> I'm pretty sure that's because wide_int doesn't have division and
> I assumed offset_int didn't either when I originally wrote the code.
> I've changed it to use division.

wide_int does have division, otherwise offset_int wouldn't have it either.
One needs to choose if one wants signed or unsigned division, operator /
does signed, one can use wi::{,s,u}div_{trunc,ceil,round} etc.
As maxobjsize shouldn't have MSB set, it probably doesn't matter if one
uses signed or unsigned division.
> +   tree nelts = array_type_nelts (reftype);
> +   if (integer_all_onesp (nelts))
> + /* Zero length array.  */
> + arrbounds[1] = 0;

Ok then.

> +   else
>   {
> -   tree bnds[] = { TYPE_MIN_VALUE (dom), TYPE_MAX_VALUE (dom) };
> -   if (TREE_CODE (arg) == COMPONENT_REF)
> - {
> -   offset_int size = maxobjsize;
> -   if (tree fldsize = component_ref_size (arg))
> - size = wi::to_offset (fldsize);
> -   arrbounds[1] = wi::lrshift (size, eltsizelog2);
> - }
> -   else if (array_at_struct_end_p (arg) || !bnds[0] || !bnds[1])
> - arrbounds[1] = wi::lrshift (maxobjsize, eltsizelog2);
> +   tree esz = TYPE_SIZE_UNIT (TREE_TYPE (reftype));
> +   if (TREE_CODE (esz) == INTEGER_CST)
> + /* Array element is either not a VLA or it's a VLA with
> +zero size (such as int A[n][n][0];).  */
> + eltsize = wi::to_offset (esz);
> else
> - arrbounds[1] = (wi::to_offset (bnds[1]) - wi::to_offset 
> (bnds[0])
> - + 1) * eltsize;
> + return false;
> +
> +   if (TREE_CODE (nelts) == INTEGER_CST)
> + arrbounds[1] = (wi::to_offset (nelts) + 1) * eltsize;
> +   else if (eltsize == 0)
> + arrbounds[1] = 0;

Doesn't arrbounds[1] == 0 mean there will be warnings for any accesses?
For eltsize == 0 I think you shouldn't warn when nelts isn't known,
instead of always warning, arr[1] will have the same address as
arr[0] ...

Otherwise LGTM.

Jakub

Re: [PATCH] array-bounds: Fix up ICE on overaligned variables [PR99109]

2021-02-17 Thread Martin Sebor via Gcc-patches


On 2/17/21 3:12 AM, Jakub Jelinek via Gcc-patches wrote:

Hi!

check_mem_ref builds artificial arrays for variables that don't have
array type.
The C standard says:
"For the purposes of these operators, a pointer to an object that is not an 
element of an
array behaves the same as a pointer to the first element of an array of length 
one with the
type of the object as its element type."
so it isn't completely wrong and does simplify the function.
But, layout_type can fail if the size of the element type is not a multiple
of its alignment (i.e. overaligned types) and we then ICE because of that.

The following patch uses TYPE_MAIN_VARIANT in those cases instead, but only
for the types that need it, as for the diagnostics it is better to use the
typedef names etc. that were really used in the source if possible.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-17  Jakub Jelinek  

PR middle-end/99109
* gimple-array-bounds.cc (overaligned_type_p): New function.
(array_bounds_checker::check_mem_ref): For overaligned types use
TYPE_MAIN_VARIANT of reftype as element type of a new array.

* g++.dg/warn/Warray-bounds-17.C: New test.

--- gcc/gimple-array-bounds.cc.jj   2021-01-04 10:25:37.471249246 +0100
+++ gcc/gimple-array-bounds.cc  2021-02-16 17:00:41.961750114 +0100
@@ -386,6 +386,21 @@ build_zero_elt_array_type (tree eltype)
return arrtype;
  }
  
+/* Return true if it is not possible to build an array with element type

+   ELTYPE, because either ELTYPE has alignment larger than its size
+   or its size is not a multiple of its alignment.  */
+
+static bool
+overaligned_type_p (tree eltype)
+{
+  return (TYPE_SIZE_UNIT (eltype)
+ && TREE_CODE (TYPE_SIZE_UNIT (eltype)) == INTEGER_CST
+ && !integer_zerop (TYPE_SIZE_UNIT (eltype))
+ && TYPE_ALIGN_UNIT (eltype) > 1
+ && wi::zext (wi::to_wide (TYPE_SIZE_UNIT (eltype)),
+  ffs_hwi (TYPE_ALIGN_UNIT (eltype)) - 1) != 0);
+}
+
  /* Checks one MEM_REF in REF, located at LOCATION, for out-of-bounds
 references to string constants.  If VRP can determine that the array
 subscript is a constant, check if it is outside valid range.
@@ -553,6 +568,8 @@ array_bounds_checker::check_mem_ref (loc
reftype = TREE_TYPE (TREE_TYPE (arg));
if (TREE_CODE (reftype) == ARRAY_TYPE)
reftype = TREE_TYPE (reftype);
+  else if (overaligned_type_p (reftype))
+   reftype = TYPE_MAIN_VARIANT (reftype);
if (tree refsize = TYPE_SIZE_UNIT (reftype))
if (TREE_CODE (refsize) == INTEGER_CST)
  eltsize = wi::to_offset (refsize);
@@ -675,7 +692,11 @@ array_bounds_checker::check_mem_ref (loc
/* Treat a reference to a non-array object as one to an array
 of a single element.  */
if (TREE_CODE (reftype) != ARRAY_TYPE)
-   reftype = build_array_type_nelts (reftype, 1);
+   {
+ if (overaligned_type_p (reftype))
+   reftype = TYPE_MAIN_VARIANT (reftype);
+ reftype = build_array_type_nelts (reftype, 1);
+   }


Rather than complicating the logic in the caller (which is already
long and hard to follow) I'd suggest to consider changing
the build_zero_elt_array_type() helper to handle both kinds of arrays
(i.e., zero-length and otherwise), and strip the excess alignment
from reftype in it.  That will simplify the function.

I recall considering this (using a single helper) but not why I didn't
do it to begin with.

Martin

  
/* Extract the element type out of MEM_REF and use its size

 to compute the index to print in the diagnostic; arrays
--- gcc/testsuite/g++.dg/warn/Warray-bounds-17.C.jj 2021-02-16 
17:24:14.178813304 +0100
+++ gcc/testsuite/g++.dg/warn/Warray-bounds-17.C2021-02-16 
17:23:35.305251062 +0100
@@ -0,0 +1,15 @@
+// PR middle-end/99109
+// { dg-do compile }
+// { dg-options "-O2 -Warray-bounds" }
+
+typedef int A __attribute__((aligned (64)));
+void foo (int *);
+
+void
+bar (void)
+{
+  A b; // { dg-message "while referencing" }
+  int *p = &b;
+  int *x = (p - 1);// { dg-warning "outside array bounds" }
+  foo (x);
+}

Jakub

Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-02-17 Thread Martin Sebor via Gcc-patches


On 2/17/21 6:56 AM, Jakub Jelinek wrote:

On Tue, Feb 16, 2021 at 08:34:41PM -0700, Martin Sebor via Gcc-patches wrote:

+ if (integer_all_onesp (nelts))
+   /* Zero length array.  */
+   eltsize = 0;
+ else
{
- tree bnds[] = { TYPE_MIN_VALUE (dom), TYPE_MAX_VALUE (dom) };
- if (TREE_CODE (arg) == COMPONENT_REF)
-   {
- offset_int size = maxobjsize;
- if (tree fldsize = component_ref_size (arg))
-   size = wi::to_offset (fldsize);
- arrbounds[1] = wi::lrshift (size, eltsizelog2);
-   }
- else if (array_at_struct_end_p (arg) || !bnds[0] || !bnds[1])
-   arrbounds[1] = wi::lrshift (maxobjsize, eltsizelog2);
- else
-   arrbounds[1] = (wi::to_offset (bnds[1]) - wi::to_offset 
(bnds[0])
-   + 1) * eltsize;
+ tree esz = TYPE_SIZE_UNIT (TREE_TYPE (reftype));
+ if (TREE_CODE (esz) == INTEGER_CST)
+   /* Array element is not a VLA.  */
+   eltsize = wi::to_offset (esz);
}
+
+ if (!array_at_struct_end_p (arg)
+ && TREE_CODE (nelts) == INTEGER_CST)
+   arrbounds[1] = (wi::to_offset (nelts) + 1) * eltsize;
  else
-   arrbounds[1] = wi::lrshift (maxobjsize, eltsizelog2);
+   {
+ /* Use log2 of size to convert the array byte size in to its
+upper bound in elements.  */
+ const offset_int eltsizelog2 = wi::floor_log2 (eltsize);
+ arrbounds[1] = wi::lrshift (maxobjsize, eltsizelog2);


So, what will this do for zero length arrays at the end of struct?
eltsize will be 0 and wi::floor_log2 (eltsize) is -1, and shifting by -1,
while maybe not UB in wide_int computations, is certainly just weird.
Why do you use eltsize = 0 for the [0] arrays?  They can still have their
element size and if array_at_struct_end_p and the element type is not a variable
length type, using the actual eltsize seems better.
Only when !array_at_struct_end_p we should ensure arrbounds[1] will be 0
and indeed that (wi::to_offset (nelts) + 1) * eltsize would likely not do
that because wi::to_offset is -1ULL or so, not -1.


The code is only entered for references to declared objects so
the array_at_struct_end_p() test is unnecessary.  I've removed
it in the attached revision.



Also, I'm not sure I understand the right shift by floor_log2 of eltsize,
why can't you simply divide maxobjsize by eltsize (if eltsize is not 0).


I'm pretty sure that's because wide_int doesn't have division and
I assumed offset_int didn't either when I originally wrote the code.
I've changed it to use division.

Attached is a revised patch.

Martin
PR tree-optimization/99121 - ICE in -Warray-bounds on a VLA of zero-length array

gcc/ChangeLog:

	PR tree-optimization/99121
	* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
	Avoid assuming array element size is constant.  Handle zero-length
	arrays of VLAs.

gcc/testsuite/ChangeLog:

	PR tree-optimization/99121
	* c-c++-common/Warray-bounds-9.c: New test.
	* gcc.dg/Warray-bounds-71.c: New test.

diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 2576556f76b..046b78d463e 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -594,34 +594,31 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
 	  || (DECL_EXTERNAL (arg) && array_at_struct_end_p (ref
 	return false;
 
-  /* FIXME: Should this be 1 for Fortran?  */
   arrbounds[0] = 0;
 
   if (TREE_CODE (reftype) == ARRAY_TYPE)
 	{
-	  /* Set to the size of the array element (and adjust below).  */
-	  eltsize = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (reftype)));
-	  /* Use log2 of size to convert the array byte size in to its
-	 upper bound in elements.  */
-	  const offset_int eltsizelog2 = wi::floor_log2 (eltsize);
-	  if (tree dom = TYPE_DOMAIN (reftype))
+	  tree nelts = array_type_nelts (reftype);
+	  if (integer_all_onesp (nelts))
+	/* Zero length array.  */
+	arrbounds[1] = 0;
+	  else
 	{
-	  tree bnds[] = { TYPE_MIN_VALUE (dom), TYPE_MAX_VALUE (dom) };
-	  if (TREE_CODE (arg) == COMPONENT_REF)
-		{
-		  offset_int size = maxobjsize;
-		  if (tree fldsize = component_ref_size (arg))
-		size = wi::to_offset (fldsize);
-		  arrbounds[1] = wi::lrshift (size, eltsizelog2);
-		}
-	  else if (array_at_struct_end_p (arg) || !bnds[0] || !bnds[1])
-		arrbounds[1] = wi::lrshift (maxobjsize, eltsizelog2);
+	  tree esz = TYPE_SIZE_UNIT (TREE_TYPE (reftype));
+	  if (TREE_CODE (esz) == INTEGER_CST)
+		/* Array element is either not a VLA or it's a VLA with
+		   zero size (such as int A[n][n][0];).  */
+		eltsize = wi::to_offset (esz);
 	  else
-		arrbounds[1] = (wi::to_offset (bnds[1]) - wi::to_offset (bnds[0])
-+ 1) * eltsize;
+		return false;
+
+

Re: [PATCH v4] Add retain attribute to place symbols in SHF_GNU_RETAIN section

2021-02-17 Thread Jakub Jelinek via Gcc-patches

On Wed, Feb 17, 2021 at 12:04:34PM -0800, H.J. Lu wrote:> >
> +  /* For -fretain-used-symbol, a "used" attribute also implies "retain".  */

s/-symbol/-symbols/

> +  if (flag_retain_used_symbols
> +  && attributes
> +  && (TREE_CODE (*node) == FUNCTION_DECL
> +   || (VAR_P (*node) && TREE_STATIC (*node))
> +   || (TREE_CODE (*node) == TYPE_DECL))

What will "retain" do on TYPE_DECLs?  It only makes sense to me
for FUNCTION_DECL or non-automatic VAR_DECLs, unless for types
it means the types with that result in vars with those types automatically
getting "retain", but there is no code for that and I don't see "used"
handled that way.

> +  if (SUPPORTS_SHF_GNU_RETAIN
> +  && (TREE_CODE (node) == FUNCTION_DECL
> +   || (VAR_P (node) && TREE_STATIC (node))
> +   || (TREE_CODE (node) == TYPE_DECL)))

Likewise.
> +;
> +  else
> +{
> +  warning (OPT_Wattributes, "%qE attribute ignored", name);
> +  *no_add_attrs = true;
> +}
> +
> +  return NULL_TREE;
> +}
> +
>  /* Handle a "externally_visible" attribute; arguments as in
> struct attribute_spec.handler.  */
>  
> diff --git a/gcc/common.opt b/gcc/common.opt
> index c75dd36843e..70842481d4d 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2404,6 +2404,10 @@ frerun-loop-opt
>  Common Ignore
>  Does nothing.  Preserved for backward compatibility.
>  
> +fretain-used-symbols
> +Common Var(flag_retain_used_symbols)
> +Make the used attribute to imply the retain attribute if supported.

English is not my native language, but I think the "to " doesn't belong
here.

> +@item -fretain-used-symbols
> +@opindex fretain-used-symbols
> +On systems with recent GNU assembler and linker, the compiler makes

I think we should avoid recent here, new/old/recent etc. are relative terms.
Either use exact version (is it 2.36 or later) or say GNU assembler and
linker that supports the @code{.retain} directive or something similar.

>flags |= SECTION_NAMED;
> -  if (SUPPORTS_SHF_GNU_RETAIN
> -  && decl != nullptr
> +  if (decl != nullptr
>&& DECL_P (decl)
> -  && DECL_PRESERVE_P (decl))
> +  && DECL_PRESERVE_P (decl)
> +  && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
>  flags |= SECTION_RETAIN;
>if (*slot == NULL)
>  {

I think none of the varasm.c code should use DECL_PRESERVE_P when it means
retain, just lookup_attribute.

> @@ -487,7 +487,8 @@ resolve_unique_section (tree decl, int reloc 
> ATTRIBUTE_UNUSED,
>if (DECL_SECTION_NAME (decl) == NULL
>&& targetm_common.have_named_sections
>&& (flag_function_or_data_sections
> -   || (SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl))
> +   || (DECL_PRESERVE_P (decl)
> +   && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> || DECL_COMDAT_GROUP (decl)))
>  {
>targetm.asm_out.unique_section (decl, reloc);
> @@ -1227,7 +1228,8 @@ get_variable_section (tree decl, bool prefer_noswitch_p)
>  vnode->get_constructor ();
>  
>if (DECL_COMMON (decl)
> -  && !(SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl)))
> +  && !(DECL_PRESERVE_P (decl)
> +&& lookup_attribute ("retain", DECL_ATTRIBUTES (decl
>  {
>/* If the decl has been given an explicit section name, or it resides
>in a non-generic address space, then it isn't common, and shouldn't
> @@ -7761,12 +7763,14 @@ switch_to_section (section *new_section, tree decl)
>  {
>if (in_section == new_section)
>  {
> -  if (SUPPORTS_SHF_GNU_RETAIN
> -   && (new_section->common.flags & SECTION_NAMED)
> +  if ((new_section->common.flags & SECTION_NAMED)
> && decl != nullptr
> && DECL_P (decl)
> && (!!DECL_PRESERVE_P (decl)
> -   != !!(new_section->common.flags & SECTION_RETAIN)))
> +   != !!(new_section->common.flags & SECTION_RETAIN))
> +   && (lookup_attribute ("retain",
> + DECL_ATTRIBUTES (new_section->named.decl))
> +   || lookup_attribute ("retain", DECL_ATTRIBUTES (decl
>   {
> /* If the SECTION_RETAIN bit doesn't match, switch to a new
>section.  */
> -- 
> 2.29.2
> 


Jakub

[PATCH v4] Add retain attribute to place symbols in SHF_GNU_RETAIN section

2021-02-17 Thread H.J. Lu via Gcc-patches

On Wed, Feb 17, 2021 at 10:57 AM Jakub Jelinek  wrote:
>
> On Wed, Feb 17, 2021 at 09:34:34AM -0800, H.J. Lu wrote:
> > -fretain-used-symols.
> > +/* Add the NAME attribute to *ANODE.  */
> > +
> > +static void
> > +add_attribute (tree *anode, int flags, tree name, tree args, tree ns,
> > +const bool cxx11_attr_p,
> > +const struct attribute_spec *spec)
>
> Why do you need this and can't instead do what decl_attributes does
> e.g. for "noipa" attribute?
>
>   /* A "noipa" function attribute implies "noinline", "noclone" and "no_icf"
>  for those targets that support it.  */
>   if (TREE_CODE (*node) == FUNCTION_DECL
>   && attributes
>   && lookup_attribute ("noipa", attributes) != NULL
>   && lookup_attribute_spec (get_identifier ("noipa")))
> {
>   if (lookup_attribute ("noinline", attributes) == NULL)
> attributes = tree_cons (get_identifier ("noinline"), NULL, 
> attributes);
>
>   if (lookup_attribute ("noclone", attributes) == NULL)
> attributes = tree_cons (get_identifier ("noclone"),  NULL, 
> attributes);
>
>   if (lookup_attribute ("no_icf", attributes) == NULL)
> attributes = tree_cons (get_identifier ("no_icf"),  NULL, attributes);
> }
>
> Of course, for "used" attributes and flag_retain_used_symbols it would need
> to be DECL_P, but otherwise pretty much the same thing.

Fixed.

> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -2404,6 +2404,10 @@ frerun-loop-opt
> >  Common Ignore
> >  Does nothing.  Preserved for backward compatibility.
> >
> > +fretain-used-symols
>
> s/symols/symbols/ - many times in the patch.

Fixed.

> > +@item retain
> > +@cindex @code{retain} function attribute
> > +This attribute is the same as the @code{used} attribute, except
> > +for ELF targets that support the GNU or FreeBSD OSABIs, this attribute
>
> I'm not sure "retain" attribute should imply "used", one can always use
> __attribute__((used, retain)) if one wants both.
> I think it should only imply the GC avoidance and nothing else.
>

Fixed.

Here is the updated patch.

-- 
H.J.
From 0521e87ee76cbcfc560289c78ae683764da0d964 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 15 Feb 2021 11:31:12 -0800
Subject: [PATCH v4] Add retain attribute to place symbols in SHF_GNU_RETAIN
 section

When building Linux kernel, ld in bninutils 2.36 with GCC 11 generates
thousands of

ld: warning: orphan section `.data.event_initcall_finish' from `init/main.o' being placed in section `.data.event_initcall_finish'
ld: warning: orphan section `.data.event_initcall_start' from `init/main.o' being placed in section `.data.event_initcall_start'
ld: warning: orphan section `.data.event_initcall_level' from `init/main.o' being placed in section `.data.event_initcall_level'

Since these sections are marked with SHF_GNU_RETAIN, they are placed in
separate sections.  They become orphan sections since they aren't expected
in the Linux kernel linker script. But orphan sections normally don't work
well with the Linux kernel linker script and the resulting kernel crashed.

1. Add the "retain" attribute to place symbols in separate SHF_GNU_RETAIN
sections.  Issue a warning if the configured assembler/linker doesn't
support SHF_GNU_RETAIN.
2. Add -fretain-used-symbols to treat symbols with the "used" attribute
the same as the @code{retain} attribute.

gcc/

	PR target/99113
	* attribs.c (decl_attributes): Imply "retain" attribute for
	"used" attribute when -fretain-used-symbols is used.
	* common.opt: Add -fretain-used-symbols.
	* toplev.c (process_options): Issue a warning for
	-fretain-used-symbols without SUPPORTS_SHF_GNU_RETAIN.
	* varasm.c (get_section): Replace SUPPORTS_SHF_GNU_RETAIN with
	looking up the retain attribute.
	(resolve_unique_section): Likewise.
	(get_variable_section): Likewise.
	(switch_to_section): Likewise.
	* doc/extend.texi: Document the "retain" attribute.
	* doc/invoke.texi: Document -fretain-used-symbols.

gcc/c-family/

	PR target/99113
	* c-attribs.c (c_common_attribute_table): Add the "retain"
	attribute.
	(handle_retain_attribute): New function.

gcc/testsuite/

	PR target/99113
	* c-c++-common/attr-retain-1.c: New test.
	* c-c++-common/attr-retain-2.c: Likewise.
	* c-c++-common/attr-retain-3.c: Likewise.
	* c-c++-common/attr-retain-4.c: Likewise.
	* c-c++-common/pr99113.c: Likewise.
	* gcc.c-torture/compile/attr-retain-1.c: Likewise.
	* gcc.c-torture/compile/attr-retain-2.c: Likewise.
	* c-c++-common/attr-used.c: Pass -fretain-used-symbols if supported.
	* c-c++-common/attr-used-2.c: Likewise.
	* c-c++-common/attr-used-3.c: Likewise.
	* c-c++-common/attr-used-4.c: Likewise.
	* c-c++-common/attr-used-5.c: Likewise.
	* c-c++-common/attr-used-6.c: Likewise.
	* c-c++-common/attr-used-7.c: Likewise.
	* c-c++-common/attr-used-8.c: Likewise.
	* c-c++-common/attr-used-9.c: Likewise.
	* gcc.c-torture/compile/attr-used-retain-1.c: Pass
	-fretain-used-symbols.
	* gcc.c-torture/compile/attr-used-retain-2.c: Likewise.
---
 gcc/

Re: [PATCH v3] Add retain attribute to place symbols in SHF_GNU_RETAIN section

2021-02-17 Thread Jakub Jelinek via Gcc-patches

On Wed, Feb 17, 2021 at 09:34:34AM -0800, H.J. Lu wrote:
> -fretain-used-symols.
> +/* Add the NAME attribute to *ANODE.  */
> +
> +static void
> +add_attribute (tree *anode, int flags, tree name, tree args, tree ns,
> +const bool cxx11_attr_p,
> +const struct attribute_spec *spec)

Why do you need this and can't instead do what decl_attributes does
e.g. for "noipa" attribute?

  /* A "noipa" function attribute implies "noinline", "noclone" and "no_icf"
 for those targets that support it.  */
  if (TREE_CODE (*node) == FUNCTION_DECL
  && attributes
  && lookup_attribute ("noipa", attributes) != NULL
  && lookup_attribute_spec (get_identifier ("noipa")))
{
  if (lookup_attribute ("noinline", attributes) == NULL)
attributes = tree_cons (get_identifier ("noinline"), NULL, attributes);

  if (lookup_attribute ("noclone", attributes) == NULL)
attributes = tree_cons (get_identifier ("noclone"),  NULL, attributes);

  if (lookup_attribute ("no_icf", attributes) == NULL)
attributes = tree_cons (get_identifier ("no_icf"),  NULL, attributes);
}

Of course, for "used" attributes and flag_retain_used_symbols it would need
to be DECL_P, but otherwise pretty much the same thing.
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2404,6 +2404,10 @@ frerun-loop-opt
>  Common Ignore
>  Does nothing.  Preserved for backward compatibility.
>  
> +fretain-used-symols

s/symols/symbols/ - many times in the patch.

> +@item retain
> +@cindex @code{retain} function attribute
> +This attribute is the same as the @code{used} attribute, except
> +for ELF targets that support the GNU or FreeBSD OSABIs, this attribute

I'm not sure "retain" attribute should imply "used", one can always use
__attribute__((used, retain)) if one wants both.
I think it should only imply the GC avoidance and nothing else.

Jakub

[PATCH v3] Add retain attribute to place symbols in SHF_GNU_RETAIN section

2021-02-17 Thread H.J. Lu via Gcc-patches

On Wed, Feb 17, 2021 at 6:26 AM Jakub Jelinek  wrote:
>
> On Tue, Feb 16, 2021 at 11:59:21AM -0800, H.J. Lu wrote:
> >   PR target/99113
> >   * common.opt: Add -fgnu-retain.
>
> I'm not sure -fgnu-retain as the option name.
> Wouldn't say -fretain-used-vars be better?

I used -fretain-used-symols for both function and variable.

> > @@ -1666,6 +1666,10 @@ floop-unroll-and-jam
> >  Common Var(flag_unroll_jam) Optimization
> >  Perform unroll-and-jam on loops.
> >
> > +fgnu-retain
> > +Common Var(flag_gnu_retain)
> > +Use SHF_GNU_RETAIN on used symbols if supported by the assembler and the 
> > linker.
>
> on variables with the used attribute?

Fixed.

> > diff --git a/gcc/toplev.c b/gcc/toplev.c
> > index d8cc254adef..119cd7c0432 100644
> > --- a/gcc/toplev.c
> > +++ b/gcc/toplev.c
> > @@ -1761,6 +1761,13 @@ process_options (void)
> >if (flag_large_source_files)
> >  line_table->default_range_bits = 0;
> >
> > +  if (flag_gnu_retain && !SUPPORTS_SHF_GNU_RETAIN)
> > +{
> > +  warning_at (UNKNOWN_LOCATION, 0, "%qs is not supported for this 
> > target",
> > +   "-fgnu-retain");
> > +  flag_gnu_retain = 0;
> > +}
> > +
> >/* Please don't change global_options after this point, those changes 
> > won't
> >   be reflected in optimization_{default,current}_node.  */
> >  }
> > diff --git a/gcc/varasm.c b/gcc/varasm.c
> > index 29478ab0d8d..4e0e30abee5 100644
> > --- a/gcc/varasm.c
> > +++ b/gcc/varasm.c
> > @@ -297,7 +297,7 @@ get_section (const char *name, unsigned int flags, tree 
> > decl,
> >slot = section_htab->find_slot_with_hash (name, htab_hash_string (name),
> >   INSERT);
> >flags |= SECTION_NAMED;
> > -  if (SUPPORTS_SHF_GNU_RETAIN
> > +  if (flag_gnu_retain
> >&& decl != nullptr
> >&& DECL_P (decl)
> >&& DECL_PRESERVE_P (decl))
> > @@ -487,7 +487,7 @@ resolve_unique_section (tree decl, int reloc 
> > ATTRIBUTE_UNUSED,
> >if (DECL_SECTION_NAME (decl) == NULL
> >&& targetm_common.have_named_sections
> >&& (flag_function_or_data_sections
> > -   || (SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl))
> > +   || (flag_gnu_retain && DECL_PRESERVE_P (decl))
> > || DECL_COMDAT_GROUP (decl)))
> >  {
> >targetm.asm_out.unique_section (decl, reloc);
>
> I'm not convinced this will work properly with LTO, the option from
> -flto -c compilation would be ignored.
> For functions, we usually mark the option Optimization and make it
> saved/restored on function switches, but this is for variables, so it will
> not work for those.
> I'd think better would be to add when seeing "used" attribute some
> artificial attribute (unless we have "retain" attribute to mean that, say
> "retain ") when the flag is on in handle_used_attribute and use that
> attribute in varasm instead of the option?

Fixed.

> > @@ -1227,7 +1227,7 @@ get_variable_section (tree decl, bool 
> > prefer_noswitch_p)
> >  vnode->get_constructor ();
> >
> >if (DECL_COMMON (decl)
> > -  && !(SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl)))
> > +  && !(flag_gnu_retain && DECL_PRESERVE_P (decl)))
> >  {
> >/* If the decl has been given an explicit section name, or it resides
> >in a non-generic address space, then it isn't common, and shouldn't
> > @@ -7761,7 +7761,7 @@ switch_to_section (section *new_section, tree decl)
> >  {
> >if (in_section == new_section)
> >  {
> > -  if (SUPPORTS_SHF_GNU_RETAIN
> > +  if (flag_gnu_retain
> > && (new_section->common.flags & SECTION_NAMED)
> > && decl != nullptr
> > && DECL_P (decl)
>
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -16168,6 +16168,11 @@ DSOs; if your program relies on reinitialization 
> > of a DSO via
> > @code{dlclose} and @code{dlopen}, you can use
> > @option{-fno-gnu-unique}.
> >
> > +@item -fgnu-retain
> > +@opindex fgnu-retain
> > +On systems with recent GNU assembler and linker, the compiler places
> > +used symbols in separate SHF_GNU_RETAIN sections.
>
> Again, variables with the used attribute.

Fixed.

Here is the updated patch to add retain attribute and
-fretain-used-symols.


-- 
H.J.
From 739912815dcbd4f0683cfcade9ad565d1f913667 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 15 Feb 2021 11:31:12 -0800
Subject: [PATCH v3] Add retain attribute to place symbols in SHF_GNU_RETAIN
 section

When building Linux kernel, ld in bninutils 2.36 with GCC 11 generates
thousands of

ld: warning: orphan section `.data.event_initcall_finish' from `init/main.o' being placed in section `.data.event_initcall_finish'
ld: warning: orphan section `.data.event_initcall_start' from `init/main.o' being placed in section `.data.event_initcall_start'
ld: warning: orphan section `.data.event_initcall_level' from `init/main.o' being placed in section `.data.event_initcall_level'

Since these sections are marked with SHF_GNU_RETAIN, they are p

[PATCH] PR 99133, Mark xxspltiw, xxspltidp, and xxsplti32x as being prefixed

2021-02-17 Thread Michael Meissner via Gcc-patches

PR 99133, Mark xxspltiw, xxspltidp, and xxsplti32x as being prefixed

I noticed that the power10 xxspltiw, xxspltidp, and xxsplti32dx
instructions are not flagged as prefixed instructions, which means the
instruction length is not set to 12 bytes.  This patch sets these
instructions to be prefixed.  It also ensures that a leading 'p' is not
emitted before the instruction.

I checked this patch by doing a bootstrap build/check on a little endian power8
server system.  There were no regressions.  In addition, I debugged cc1, and I
put a breakpoint on the get_attr_length function and I verified the insns now
have length 12.  Can I check this patch into the master branch?

GCC 10 does not have support for these instructions, so I will not need to do a
backport.

gcc/
2021-02-17  Michael Meissner  

PR target/99133
* config/rs6000/altivec.md (xxspltiw_v4si): Set prefixed
attribute.
(xxspltiw_v4sf): Set prefixed attribute.
(xxspltidp_v2df): Set prefixed attribute.
(xxsplti32dx_v4si): Set prefixed attribute.
(xxsplti32dx_v4si_inst): Set prefixed attribute.
(xxsplti32dx_v4sf): Set prefixed attribute.
* config/rs6000/rs6000.c (rs6000_insn_cost): Add support for
setting prefixed attribute to special.
(rs6000_final_prescan_insn): If prefixed attribute is special, do
not emit a leading 'p' before the instruction.
(rs6000_adjust_insn_length): Add support for setting prefixed
attribute to special.
* config/rs6000/rs6000.md (prefixed attribute): Add 'special' case
for prefixed instructions that do not use a leading 'p'.
---
 gcc/config/rs6000/altivec.md | 18 --
 gcc/config/rs6000/rs6000.c   | 13 +
 gcc/config/rs6000/rs6000.md  |  7 ++-
 3 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 27a269b9e72..d409acce3a9 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -826,7 +826,8 @@ (define_insn "xxspltiw_v4si"
 UNSPEC_XXSPLTIW))]
  "TARGET_POWER10"
  "xxspltiw %x0,%1"
- [(set_attr "type" "vecsimple")])
+ [(set_attr "type" "vecsimple")
+  (set_attr "prefixed" "special")])
 
 (define_expand "xxspltiw_v4sf"
   [(set (match_operand:V4SF 0 "register_operand" "=wa")
@@ -845,7 +846,8 @@ (define_insn "xxspltiw_v4sf_inst"
 UNSPEC_XXSPLTIW))]
  "TARGET_POWER10"
  "xxspltiw %x0,%1"
- [(set_attr "type" "vecsimple")])
+ [(set_attr "type" "vecsimple")
+  (set_attr "prefixed" "special")])
 
 (define_expand "xxspltidp_v2df"
   [(set (match_operand:V2DF 0 "register_operand" )
@@ -864,7 +866,8 @@ (define_insn "xxspltidp_v2df_inst"
 UNSPEC_XXSPLTID))]
   "TARGET_POWER10"
   "xxspltidp %x0,%1"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+  (set_attr "prefixed" "special")])
 
 (define_expand "xxsplti32dx_v4si"
   [(set (match_operand:V4SI 0 "register_operand" "=wa")
@@ -883,7 +886,8 @@ (define_expand "xxsplti32dx_v4si"
 GEN_INT (index), operands[3]));
DONE;
 }
- [(set_attr "type" "vecsimple")])
+ [(set_attr "type" "vecsimple")
+  (set_attr "prefixed" "special")])
 
 (define_insn "xxsplti32dx_v4si_inst"
   [(set (match_operand:V4SI 0 "register_operand" "=wa")
@@ -893,7 +897,8 @@ (define_insn "xxsplti32dx_v4si_inst"
 UNSPEC_XXSPLTI32DX))]
   "TARGET_POWER10"
   "xxsplti32dx %x0,%2,%3"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+  (set_attr "prefixed" "special")])
 
 (define_expand "xxsplti32dx_v4sf"
   [(set (match_operand:V4SF 0 "register_operand" "=wa")
@@ -921,7 +926,8 @@ (define_insn "xxsplti32dx_v4sf_inst"
 UNSPEC_XXSPLTI32DX))]
   "TARGET_POWER10"
   "xxsplti32dx %x0,%2,%3"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+  (set_attr "prefixed" "special")])
 
 (define_insn "xxblend_"
   [(set (match_operand:VM3 0 "register_operand" "=wa")
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 798a715005c..f01fda35459 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -22031,7 +22031,7 @@ rs6000_insn_cost (rtx_insn *insn, bool speed)
   if (n == 0)
 {
   int length = get_attr_length (insn);
-  if (get_attr_prefixed (insn) == PREFIXED_YES)
+  if (get_attr_prefixed (insn) != PREFIXED_NO)
{
  int adjust = 0;
  ADJUST_INSN_LENGTH (insn, adjust);
@@ -26212,11 +26212,16 @@ static bool next_insn_prefixed_p;
insn is a prefixed insn where we need to emit a 'p' before the insn.
 
In addition, if the insn is part of a PC-relative reference to an external
-   label optimization, this is recorded also.  */
+   label optimization, this is recorded also.
+
+   Some prefixed instructions (xxspltiw, xxspltidp, xxsplti32dp, etc.) do not
+   have a leading 'p'.  Setting the prefix attribute

[committed] analyzer: fix false leak involving params [PR98969]

2021-02-17 Thread David Malcolm via Gcc-patches

This patch updates the svalue liveness code so that the initial value
of parameters at top-level functions to the analysis are treated as
live (since the values are presumably still live within the
outside-of-the-analysis calling code).

This fixes the false leak in PR analyzer/98969 seen on:

void
test (long int i)
{
  struct foo *f = (struct foo *)i;
  f->expr = __builtin_malloc (1024);
}

since the calling code can presumably still access the allocated
buffer via:
  ((struct foo *)i)->expr

The patch also removes the expected leak warnings from
g++.dg/analyzer/pr99064.C and gcc.dg/analyzer/pr96841.c, which now
appear to me to be false positives.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7270-ge0139b2a912585496f23c352f0e2c56895f78fbf.

gcc/analyzer/ChangeLog:
PR analyzer/98969
* constraint-manager.cc (dead_svalue_purger::should_purge_p):
Update for change to svalue::live_p.
* program-state.cc (sm_state_map::on_liveness_change): Likewise.
(program_state::detect_leaks): Likewise.
* region-model-reachability.cc (reachable_regions::init_cluster):
When dealing with a symbolic region, if the underlying pointer is
implicitly live, add the region to the reachable regions.
* region-model.cc (region_model::compare_initial_and_pointer):
Move logic for detecting initial values of params to
initial_svalue::initial_value_of_param_p.
* svalue.cc (svalue::live_p): Convert "live_svalues" from a
reference to a pointer; support it being NULL.
(svalue::implicitly_live_p): Convert first param from a
refererence to a pointer.
(region_svalue::implicitly_live_p): Likewise.
(constant_svalue::implicitly_live_p): Likewise.
(initial_svalue::implicitly_live_p): Likewise.  Treat the initial
values of params for the top level frame as still live.
(initial_svalue::initial_value_of_param_p): New function, taken
from a test in region_model::compare_initial_and_pointer.
(unaryop_svalue::implicitly_live_p): Convert first param from a
refererence to a pointer.
(binop_svalue::implicitly_live_p): Likewise.
(sub_svalue::implicitly_live_p): Likewise.
(unmergeable_svalue::implicitly_live_p): Likewise.
* svalue.h (svalue::live_p): Likewise.
(svalue::implicitly_live_p): Likewise.
(region_svalue::implicitly_live_p): Likewise.
(constant_svalue::implicitly_live_p): Likewise.
(initial_svalue::implicitly_live_p): Likewise.
(initial_svalue::initial_value_of_param_p): New decl.
(unaryop_svalue::implicitly_live_p): Convert first param from a
refererence to a pointer.
(binop_svalue::implicitly_live_p): Likewise.
(sub_svalue::implicitly_live_p): Likewise.
(unmergeable_svalue::implicitly_live_p): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/98969
* g++.dg/analyzer/pr99064.C: Convert dg-bogus to dg-warning.
* gcc.dg/analyzer/pr96841.c: Add -Wno-analyzer-too-complex to
options.  Remove false leak directive.
* gcc.dg/analyzer/pr98969.c (test_1): Remove xfail from leak
false positive.
(test_3): New.
---
 gcc/analyzer/constraint-manager.cc|  2 +-
 gcc/analyzer/program-state.cc |  4 +-
 gcc/analyzer/region-model-reachability.cc |  2 +
 gcc/analyzer/region-model.cc  | 14 +-
 gcc/analyzer/svalue.cc| 52 +--
 gcc/analyzer/svalue.h | 20 +
 gcc/testsuite/g++.dg/analyzer/pr99064.C   |  2 +-
 gcc/testsuite/gcc.dg/analyzer/pr96841.c   |  4 +-
 gcc/testsuite/gcc.dg/analyzer/pr98969.c   |  9 +++-
 9 files changed, 69 insertions(+), 40 deletions(-)

diff --git a/gcc/analyzer/constraint-manager.cc 
b/gcc/analyzer/constraint-manager.cc
index 23421080fba..4dadd200bee 100644
--- a/gcc/analyzer/constraint-manager.cc
+++ b/gcc/analyzer/constraint-manager.cc
@@ -1643,7 +1643,7 @@ public:
 
   bool should_purge_p (const svalue *sval) const
   {
-return !sval->live_p (m_live_svalues, m_model);
+return !sval->live_p (&m_live_svalues, m_model);
   }
 
 private:
diff --git a/gcc/analyzer/program-state.cc b/gcc/analyzer/program-state.cc
index 60fed56cd65..e427fff59d6 100644
--- a/gcc/analyzer/program-state.cc
+++ b/gcc/analyzer/program-state.cc
@@ -523,7 +523,7 @@ sm_state_map::on_liveness_change (const svalue_set 
&live_svalues,
++iter)
 {
   const svalue *iter_sval = (*iter).first;
-  if (!iter_sval->live_p (live_svalues, model))
+  if (!iter_sval->live_p (&live_svalues, model))
{
  svals_to_unset.add (iter_sval);
  entry_t e = (*iter).second;
@@ -1201,7 +1201,7 @@ program_state::detect_leaks (const program_state 
&src_state,
 live in DEST_STATE: either explicitly reachable, or implicitly
 live based on the set of explici

Re: [PATCH v2] Add -fgnu-retain to place used symbols in SHF_GNU_RETAIN section

2021-02-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Feb 16, 2021 at 11:59:21AM -0800, H.J. Lu wrote:
>   PR target/99113
>   * common.opt: Add -fgnu-retain.

I'm not sure -fgnu-retain as the option name.
Wouldn't say -fretain-used-vars be better?

> @@ -1666,6 +1666,10 @@ floop-unroll-and-jam
>  Common Var(flag_unroll_jam) Optimization
>  Perform unroll-and-jam on loops.
>  
> +fgnu-retain
> +Common Var(flag_gnu_retain)
> +Use SHF_GNU_RETAIN on used symbols if supported by the assembler and the 
> linker.

on variables with the used attribute?

> diff --git a/gcc/toplev.c b/gcc/toplev.c
> index d8cc254adef..119cd7c0432 100644
> --- a/gcc/toplev.c
> +++ b/gcc/toplev.c
> @@ -1761,6 +1761,13 @@ process_options (void)
>if (flag_large_source_files)
>  line_table->default_range_bits = 0;
>  
> +  if (flag_gnu_retain && !SUPPORTS_SHF_GNU_RETAIN)
> +{
> +  warning_at (UNKNOWN_LOCATION, 0, "%qs is not supported for this 
> target",
> +   "-fgnu-retain");
> +  flag_gnu_retain = 0;
> +}
> +
>/* Please don't change global_options after this point, those changes won't
>   be reflected in optimization_{default,current}_node.  */
>  }
> diff --git a/gcc/varasm.c b/gcc/varasm.c
> index 29478ab0d8d..4e0e30abee5 100644
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -297,7 +297,7 @@ get_section (const char *name, unsigned int flags, tree 
> decl,
>slot = section_htab->find_slot_with_hash (name, htab_hash_string (name),
>   INSERT);
>flags |= SECTION_NAMED;
> -  if (SUPPORTS_SHF_GNU_RETAIN
> +  if (flag_gnu_retain
>&& decl != nullptr
>&& DECL_P (decl)
>&& DECL_PRESERVE_P (decl))
> @@ -487,7 +487,7 @@ resolve_unique_section (tree decl, int reloc 
> ATTRIBUTE_UNUSED,
>if (DECL_SECTION_NAME (decl) == NULL
>&& targetm_common.have_named_sections
>&& (flag_function_or_data_sections
> -   || (SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl))
> +   || (flag_gnu_retain && DECL_PRESERVE_P (decl))
> || DECL_COMDAT_GROUP (decl)))
>  {
>targetm.asm_out.unique_section (decl, reloc);

I'm not convinced this will work properly with LTO, the option from
-flto -c compilation would be ignored.
For functions, we usually mark the option Optimization and make it
saved/restored on function switches, but this is for variables, so it will
not work for those.
I'd think better would be to add when seeing "used" attribute some
artificial attribute (unless we have "retain" attribute to mean that, say
"retain ") when the flag is on in handle_used_attribute and use that
attribute in varasm instead of the option?

> @@ -1227,7 +1227,7 @@ get_variable_section (tree decl, bool prefer_noswitch_p)
>  vnode->get_constructor ();
>  
>if (DECL_COMMON (decl)
> -  && !(SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl)))
> +  && !(flag_gnu_retain && DECL_PRESERVE_P (decl)))
>  {
>/* If the decl has been given an explicit section name, or it resides
>in a non-generic address space, then it isn't common, and shouldn't
> @@ -7761,7 +7761,7 @@ switch_to_section (section *new_section, tree decl)
>  {
>if (in_section == new_section)
>  {
> -  if (SUPPORTS_SHF_GNU_RETAIN
> +  if (flag_gnu_retain
> && (new_section->common.flags & SECTION_NAMED)
> && decl != nullptr
> && DECL_P (decl)

> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -16168,6 +16168,11 @@ DSOs; if your program relies on reinitialization of 
> a DSO via
> @code{dlclose} and @code{dlopen}, you can use
> @option{-fno-gnu-unique}.
>
> +@item -fgnu-retain
> +@opindex fgnu-retain
> +On systems with recent GNU assembler and linker, the compiler places
> +used symbols in separate SHF_GNU_RETAIN sections.

Again, variables with the used attribute.

Jakub

Re: [PATCH 2/2] openacc: Strided array sections and components of derived-type arrays

2021-02-17 Thread Julian Brown

On Tue, 16 Feb 2021 11:09:17 +0100
Tobias Burnus  wrote:

> On 12.02.21 16:46, Julian Brown wrote:
> > This patch disallows selecting components of array sections in
> > update directives for OpenACC, as specified in OpenACC 3.0,
> > "2.14.4. Update Directive", "Restrictions":
> >
> >"In Fortran, members of variables of derived type may appear,
> > including a subarray of a member. Members of subarrays of derived
> > type may not appear."
> >
> > The diagnostic for attempting to use the same construct on other
> > directives has also been improved.
> >
> > OK for mainline?  
> 
> LGTM.

Thanks. FYI I've committed this version with a merge conflict fixed &
new tests updated.

Julian
commit 366cf1127a547ff77024a551abb01bb1a6e963cd
Author: Julian Brown 
Date:   Wed Feb 10 11:18:13 2021 -0800

openacc: Strided array sections and components of derived-type arrays

This patch disallows selecting components of array sections in update
directives for OpenACC, as specified in OpenACC 3.0, "2.14.4. Update
Directive":

  In Fortran, members of variables of derived type may appear, including
  a subarray of a member. Members of subarrays of derived type may
  not appear.

The diagnostic for attempting to use the same construct on other
directives has also been improved.

gcc/fortran/
* openmp.c (resolve_omp_clauses): Disallow selecting components
of arrays of derived type.

gcc/testsuite/
* gfortran.dg/goacc/array-with-dt-2.f90: Remove expected errors.
* gfortran.dg/goacc/array-with-dt-6.f90: New test.
* gfortran.dg/goacc/mapping-tests-2.f90: Update expected error.
* gfortran.dg/goacc/ref_inquiry.f90: Update expected errors.
* gfortran.dg/gomp/ref_inquiry.f90: Likewise.

libgomp/
* testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90: Remove
expected errors.

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 1e541ebdafa..bf0179007be 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -5174,17 +5174,31 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
  "are allowed on ORDERED directive at %L",
  &n->where);
 		  }
-		gfc_ref *array_ref = NULL;
+		gfc_ref *lastref = NULL, *lastslice = NULL;
 		bool resolved = false;
 		if (n->expr)
 		  {
-		array_ref = n->expr->ref;
+		lastref = n->expr->ref;
 		resolved = gfc_resolve_expr (n->expr);
 
 		/* Look through component refs to find last array
 		   reference.  */
 		if (resolved)
 		  {
+			for (gfc_ref *ref = n->expr->ref; ref; ref = ref->next)
+			  if (ref->type == REF_COMPONENT
+			  || ref->type == REF_SUBSTRING
+			  || ref->type == REF_INQUIRY)
+			lastref = ref;
+			  else if (ref->type == REF_ARRAY)
+			{
+			  for (int i = 0; i < ref->u.ar.dimen; i++)
+if (ref->u.ar.dimen_type[i] == DIMEN_RANGE)
+  lastslice = ref;
+
+			  lastref = ref;
+			}
+
 			/* The "!$acc cache" directive allows rectangular
 			   subarrays to be specified, with some restrictions
 			   on the form of bounds (not implemented).
@@ -5192,53 +5206,51 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 			   array isn't contiguous.  An expression such as
 			   arr(-n:n,-n:n) could be contiguous even if it looks
 			   like it may not be.  */
-			if (list != OMP_LIST_CACHE
+			if (code->op != EXEC_OACC_UPDATE
+			&& list != OMP_LIST_CACHE
 			&& list != OMP_LIST_DEPEND
 			&& !gfc_is_simply_contiguous (n->expr, false, true)
-			&& gfc_is_not_contiguous (n->expr))
+			&& gfc_is_not_contiguous (n->expr)
+			&& !(lastslice
+ && (lastslice->next
+ || lastslice->type != REF_ARRAY)))
 			  gfc_error ("Array is not contiguous at %L",
  &n->where);
-
-			while (array_ref
-			   && (array_ref->type == REF_COMPONENT
-   || (array_ref->type == REF_ARRAY
-   && array_ref->next
-   && (array_ref->next->type
-	   == REF_COMPONENT
-			  array_ref = array_ref->next;
 		  }
 		  }
-		if (array_ref
+		if (lastref
 		|| (n->expr
 			&& (!resolved || n->expr->expr_type != EXPR_VARIABLE)))
 		  {
-		if (array_ref
-			&& (array_ref->type == REF_SUBSTRING
-			|| (array_ref->next
-&& array_ref->next->type == REF_SUBSTRING)))
+		if (!lastslice
+			&& lastref
+			&& lastref->type == REF_SUBSTRING)
 		  gfc_error ("Unexpected substring reference in %s clause "
  "at %L", name, &n->where);
-		else if (array_ref && array_ref->type == REF_INQUIRY)
+		else if (!lastslice
+			 && lastref
+			 && lastref->type == REF_INQUIRY)
 		  {
-			gcc_assert (array_ref->u.i == INQUIRY_RE
-|| array_ref->u.i == INQUIRY_IM);
+			gcc_assert (lastref->u.i == INQUIRY_RE
+|| lastref->u.i == INQUIRY_IM);
 			gfc_error ("Unexpected complex-parts designator "
    "reference in %s clause

[AArch64] PR98657: Fix vec_duplicate creation in SVE's 3

2021-02-17 Thread Andre Vieira (lists) via Gcc-patches


Hi,

This patch prevents generating a vec_duplicate with illegal predicate.

Regression tested on aarch64-linux-gnu.

OK for trunk?

gcc/ChangeLog:
2021-02-17  Andre Vieira  

    PR target/98657
    * config/aarch64/aarch64-sve.md: Use 'expand_vector_broadcast' 
to emit vec_duplicate's

    in '3' pattern.

gcc/testsuite/ChangeLog:
2021-02-17  Andre Vieira  

    PR target/98657
    * gcc.target/aarch64/sve/pr98657.c: New test.
diff --git a/gcc/config/aarch64/aarch64-sve.md 
b/gcc/config/aarch64/aarch64-sve.md
index 
608319600318974b414e47285ee1474b041f0e05..7db2938bb84e04d066a7b07574e5cf344a3a8fb6
 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -4549,10 +4549,8 @@ (define_expand "3"
   }
 else
   {
-   amount = gen_reg_rtx (mode);
-   emit_insn (gen_vec_duplicate (amount,
-   convert_to_mode (mode,
-operands[2], 0)));
+   amount = convert_to_mode (mode, operands[2], 0);
+   amount = expand_vector_broadcast (mode, amount);
   }
 emit_insn (gen_v3 (operands[0], operands[1], amount));
 DONE;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr98657.c 
b/gcc/testsuite/gcc.target/aarch64/sve/pr98657.c
new file mode 100644
index 
..592af25d7bbc69bc05823d27358f07cd741dbe20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr98657.c
@@ -0,0 +1,9 @@
+/* PR target/98657  */
+/* { dg-do compile } */
+/* { dg-options "-O3 -msve-vector-bits=256" } */
+extern char a[];
+void b(_Bool c[][18]) {
+  int d;
+  for (int e = 0; e < 23; e++)
+a[e] = 6 >> c[1][d];
+}

Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-02-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Feb 16, 2021 at 08:34:41PM -0700, Martin Sebor via Gcc-patches wrote:
> +   if (integer_all_onesp (nelts))
> + /* Zero length array.  */
> + eltsize = 0;
> +   else
>   {
> -   tree bnds[] = { TYPE_MIN_VALUE (dom), TYPE_MAX_VALUE (dom) };
> -   if (TREE_CODE (arg) == COMPONENT_REF)
> - {
> -   offset_int size = maxobjsize;
> -   if (tree fldsize = component_ref_size (arg))
> - size = wi::to_offset (fldsize);
> -   arrbounds[1] = wi::lrshift (size, eltsizelog2);
> - }
> -   else if (array_at_struct_end_p (arg) || !bnds[0] || !bnds[1])
> - arrbounds[1] = wi::lrshift (maxobjsize, eltsizelog2);
> -   else
> - arrbounds[1] = (wi::to_offset (bnds[1]) - wi::to_offset 
> (bnds[0])
> - + 1) * eltsize;
> +   tree esz = TYPE_SIZE_UNIT (TREE_TYPE (reftype));
> +   if (TREE_CODE (esz) == INTEGER_CST)
> + /* Array element is not a VLA.  */
> + eltsize = wi::to_offset (esz);
>   }
> +
> +   if (!array_at_struct_end_p (arg)
> +   && TREE_CODE (nelts) == INTEGER_CST)
> + arrbounds[1] = (wi::to_offset (nelts) + 1) * eltsize;
> else
> - arrbounds[1] = wi::lrshift (maxobjsize, eltsizelog2);
> + {
> +   /* Use log2 of size to convert the array byte size in to its
> +  upper bound in elements.  */
> +   const offset_int eltsizelog2 = wi::floor_log2 (eltsize);
> +   arrbounds[1] = wi::lrshift (maxobjsize, eltsizelog2);

So, what will this do for zero length arrays at the end of struct?
eltsize will be 0 and wi::floor_log2 (eltsize) is -1, and shifting by -1,
while maybe not UB in wide_int computations, is certainly just weird.
Why do you use eltsize = 0 for the [0] arrays?  They can still have their
element size and if array_at_struct_end_p and the element type is not a variable
length type, using the actual eltsize seems better.
Only when !array_at_struct_end_p we should ensure arrbounds[1] will be 0
and indeed that (wi::to_offset (nelts) + 1) * eltsize would likely not do
that because wi::to_offset is -1ULL or so, not -1.

Also, I'm not sure I understand the right shift by floor_log2 of eltsize,
why can't you simply divide maxobjsize by eltsize (if eltsize is not 0).

Jakub

Re: [PATCH] c++: Fix up build_zero_init_1 once more [PR99106]

2021-02-17 Thread Nathan Sidwell


On 2/17/21 4:50 AM, Jakub Jelinek wrote:

Hi!

My earlier build_zero_init_1 patch for flexible array members created
an empty CONSTRUCTOR.  As the following testcase shows, that doesn't work
very well because the middle-end doesn't expect CONSTRUCTOR elements with
incomplete type (that the empty CONSTRUCTOR at the end of outer CONSTRUCTOR
had).

The following patch just doesn't add any CONSTRUCTOR for the flexible array
members, it doesn't seem to be needed.


lgtm -- yes, an array bound of -1 is certainly odd :)



Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-17  Jakub Jelinek  

PR sanitizer/99106
* init.c (build_zero_init_1): For flexible array members just return
NULL_TREE instead of returning empty CONSTRUCTOR with non-complete
ARRAY_TYPE.

* g++.dg/ubsan/pr99106.C: New test.

--- gcc/cp/init.c.jj2021-02-12 23:57:30.501141871 +0100
+++ gcc/cp/init.c   2021-02-16 09:29:24.635069944 +0100
@@ -252,7 +252,7 @@ build_zero_init_1 (tree type, tree nelts
 build_one_cst (TREE_TYPE (nelts)));
/* Treat flexible array members like [0] arrays.  */
else if (TYPE_DOMAIN (type) == NULL_TREE)
-   max_index = build_minus_one_cst (sizetype);
+   return NULL_TREE;
else
max_index = array_type_nelts (type);
  
--- gcc/testsuite/g++.dg/ubsan/pr99106.C.jj	2021-02-16 09:35:50.575679899 +0100

+++ gcc/testsuite/g++.dg/ubsan/pr99106.C2021-02-16 09:35:42.904767167 
+0100
@@ -0,0 +1,5 @@
+// PR sanitizer/99106
+// { dg-do compile }
+// { dg-options "-fsanitize=undefined" }
+
+#include "../ext/flexary38.C"

Jakub




--
Nathan Sidwell

c++: More set_identifier_type_value fixing [PR 99116]

2021-02-17 Thread Nathan Sidwell

My recent change looked	under template_parms in	two places, but	that was 
covering up a separate problem.  We were attempting to set the 
identifier_type_value of a template_parm into the template_parm scope. 
The peeking stopped us doing that, but confused	 poplevel, leaving an 
identifier value lying around.  This fixes the underlying problem in 
do_pushtag -- we only need to set the identifier_type_value directly 
when we're in a template_parm scope (a later pushdecl will push the 
actual	template_decl).	 for non-class non-template-parm bindings 
do_pushdecl already ends up manipulating identifier_type_value correctly.


PR c++/99116
gcc/cp/
* name-lookup.c (do_pushdecl): Don't peek under template_parm
bindings here ...
(set_identifier_type_value_with_scope): ... or here.
(do_pushtag): Only set_identifier_type_value_with_scope at
non-class template parm scope, and use parent scope.
gcc/testsuite/
* g++.dg/lookup/pr99116-1.C: New.
* g++.dg/lookup/pr99116-2.C: New.
--
Nathan Sidwell
diff --git c/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index 5aa206d40d4..0b0fb80c1ff 100644
--- c/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -3691,14 +3691,9 @@ do_pushdecl (tree decl, bool hiding)
 	if (match == error_mark_node)
 	  ;
 	else if (TREE_CODE (match) == TYPE_DECL)
-	  {
-		auto *l = level;
-		while (l->kind == sk_template_parms)
-		  l = l->level_chain;
-		gcc_checking_assert (REAL_IDENTIFIER_TYPE_VALUE (name)
- == (l->kind == sk_namespace
-	 ? NULL_TREE : TREE_TYPE (match)));
-	  }
+	  gcc_checking_assert (REAL_IDENTIFIER_TYPE_VALUE (name)
+   == (level->kind == sk_namespace
+   ? NULL_TREE : TREE_TYPE (match)));
 	else if (iter.hidden_p () && !hiding)
 	  {
 		/* Unhiding a previously hidden decl.  */
@@ -4756,12 +4751,13 @@ print_binding_stack (void)
 static void
 set_identifier_type_value_with_scope (tree id, tree decl, cp_binding_level *b)
 {
-  while (b->kind == sk_template_parms)
-b = b->level_chain;
-
   if (b->kind == sk_namespace)
 /* At namespace scope we should not see an identifier type value.  */
-gcc_checking_assert (!REAL_IDENTIFIER_TYPE_VALUE (id));
+gcc_checking_assert (!REAL_IDENTIFIER_TYPE_VALUE (id)
+			 /* We could be pushing a friend underneath a template
+			parm (ill-formed).  */
+			 || (TEMPLATE_PARM_P
+			 (TYPE_NAME (REAL_IDENTIFIER_TYPE_VALUE (id);
   else
 {
   /* Push the current type value, so we can restore it later  */
@@ -8257,10 +8253,8 @@ do_pushtag (tree name, tree type, TAG_how how)
   if (decl == error_mark_node)
 	return decl;
 
-  bool in_class = false;
   if (b->kind == sk_class)
 	{
-	  in_class = true;
 	  if (!TYPE_BEING_DEFINED (current_class_type))
 	/* Don't push anywhere if the class is complete; a lambda in an
 	   NSDMI is not a member of the class.  */
@@ -8275,7 +8269,12 @@ do_pushtag (tree name, tree type, TAG_how how)
 	pushdecl_class_level (decl);
 	}
   else if (b->kind == sk_template_parms)
-	in_class = b->level_chain->kind == sk_class;
+	{
+	  /* Do not push the tag here -- we'll want to push the
+	 TEMPLATE_DECL.  */
+	  if (b->level_chain->kind != sk_class)
+	set_identifier_type_value_with_scope (name, tdef, b->level_chain);
+	}
   else
 	{
 	  decl = do_pushdecl_with_scope
@@ -8293,9 +8292,6 @@ do_pushtag (tree name, tree type, TAG_how how)
 	}
 	}
 
-  if (!in_class)
-	set_identifier_type_value_with_scope (name, tdef, b);
-
   TYPE_CONTEXT (type) = DECL_CONTEXT (decl);
 
   /* If this is a local class, keep track of it.  We need this
diff --git c/gcc/testsuite/g++.dg/lookup/pr99116-1.C w/gcc/testsuite/g++.dg/lookup/pr99116-1.C
new file mode 100644
index 000..01b483ea915
--- /dev/null
+++ w/gcc/testsuite/g++.dg/lookup/pr99116-1.C
@@ -0,0 +1,25 @@
+// PR 99116 sliding hidden friends under template parm scopes
+
+template struct Z {
+
+  friend struct T; // { dg-error "shadows template parameter" }
+};
+
+struct Y {
+
+  template struct A {};
+
+  friend struct S;
+};
+
+struct X
+{
+  struct S2 {};
+  
+  struct In
+  {
+friend struct S2;
+  };
+};
+
+typedef int S2;
diff --git c/gcc/testsuite/g++.dg/lookup/pr99116-2.C w/gcc/testsuite/g++.dg/lookup/pr99116-2.C
new file mode 100644
index 000..2a4148bade8
--- /dev/null
+++ w/gcc/testsuite/g++.dg/lookup/pr99116-2.C
@@ -0,0 +1,19 @@
+// PR 99116, we need to remove the namespace-scope meaning of
+// IDENTIFIER_TYPE_VALUE.
+
+namespace __gnu_cxx 
+{
+template
+struct char_traits
+{
+  static void length();
+};
+
+template
+void char_traits<_T>::length ()
+{
+}
+
+}
+   
+struct char_traits;

c++: ICE with header-units [PR 99071]

2021-02-17 Thread Nathan Sidwell

This ICE was caused by dereferencing the wrong pointer and not finding 
the expected thing there.  Pointers	are like that.


PR c++/99071
gcc/cp/
* name-lookup.c (maybe_record_mergeable_decl): Deref the correct
pointer.
gcc/testsuite/
* g++.dg/modules/pr99071_a.H: New.
* g++.dg/modules/pr99071_b.H: New.

--
Nathan Sidwell
diff --git c/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index 5aa206d40d4..fda987e9616 100644
--- c/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -3525,7 +3525,7 @@ maybe_record_mergeable_decl (tree *slot, tree name, tree decl)
   if (!partition)
 {
   binding_slot &orig
-	= BINDING_VECTOR_CLUSTER (*gslot, 0).slots[BINDING_SLOT_CURRENT];
+	= BINDING_VECTOR_CLUSTER (*slot, 0).slots[BINDING_SLOT_CURRENT];
 
   if (!STAT_HACK_P (tree (orig)))
 	orig = stat_hack (tree (orig));
diff --git c/gcc/testsuite/g++.dg/modules/pr99071_a.H w/gcc/testsuite/g++.dg/modules/pr99071_a.H
new file mode 100644
index 000..44bc7c43601
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr99071_a.H
@@ -0,0 +1,6 @@
+// PR 99071 ICE with global-module merging
+// { dg-additional-options -fmodule-header }
+// { dg-module-cmi {} }
+
+template
+void begin (T *);
diff --git c/gcc/testsuite/g++.dg/modules/pr99071_b.H w/gcc/testsuite/g++.dg/modules/pr99071_b.H
new file mode 100644
index 000..1c773d74f12
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr99071_b.H
@@ -0,0 +1,8 @@
+// PR 99071 ICE with global-module merging
+// { dg-additional-options -fmodule-header }
+// { dg-module-cmi {} }
+
+import "pr99071_a.H";
+
+template
+void begin(T &);

Re: [PATCH] profiling: fix streaming of TOPN counters

2021-02-17 Thread Martin Liška


On 2/17/21 9:36 AM, Martin Liška wrote:

I'll write it even more robust...


This is more elegant approach I've just tested on the instrumented clang.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
>From a8a879ba8a8ce3e90a712ff4c436b5d993faad7d Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 16 Feb 2021 16:28:06 +0100
Subject: [PATCH] profiling: fix streaming of TOPN counters

libgcc/ChangeLog:

	PR gcov-profile/99105
	* libgcov-driver.c (write_top_counters): Rename to ...
	(write_topn_counters): ... this.
	(write_one_data): Pre-allocate buffer for number of items
	in the corresponding linked lists.
	* libgcov-merge.c (__gcov_merge_topn): Use renamed function.

gcc/testsuite/ChangeLog:

	PR gcov-profile/99105
	* gcc.dg/tree-prof/indir-call-prof-malloc.c: Use profile
	correction as the wrapped malloc is called one more time
	from libgcov.
---
 .../gcc.dg/tree-prof/indir-call-prof-malloc.c |  2 +-
 libgcc/libgcov-driver.c   | 37 +++
 libgcc/libgcov-merge.c|  1 +
 3 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-malloc.c b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-malloc.c
index 454e224c95f..7bda4aedfc8 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-malloc.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-malloc.c
@@ -1,4 +1,4 @@
-/* { dg-options "-O2 -ldl" } */
+/* { dg-options "-O2 -ldl -fprofile-correction" } */
 
 #define _GNU_SOURCE
 #include 
diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
index e474e032b54..e6efa97254e 100644
--- a/libgcc/libgcov-driver.c
+++ b/libgcc/libgcov-driver.c
@@ -337,32 +337,53 @@ read_error:
 /* Store all TOP N counters where each has a dynamic length.  */
 
 static void
-write_top_counters (const struct gcov_ctr_info *ci_ptr,
-		unsigned t_ix,
-		gcov_unsigned_t n_counts)
+write_topn_counters (const struct gcov_ctr_info *ci_ptr,
+		 unsigned t_ix,
+		 gcov_unsigned_t n_counts)
 {
   unsigned counters = n_counts / GCOV_TOPN_MEM_COUNTERS;
   gcc_assert (n_counts % GCOV_TOPN_MEM_COUNTERS == 0);
+
+  /* It can happen in a multi-threaded environment that number of counters is
+ different from the size of the corresponding linked lists.  */
+  unsigned *list_sizes = (unsigned *) xcalloc (counters, sizeof (unsigned));
   unsigned pair_total = 0;
+
   for (unsigned i = 0; i < counters; i++)
-pair_total += ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 1];
+{
+  gcov_type start = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 2];
+  for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
+	   node != NULL; node = node->next)
+	{
+	  ++pair_total;
+	  ++list_sizes[i];
+	}
+}
+
   unsigned disk_size = GCOV_TOPN_DISK_COUNTERS * counters + 2 * pair_total;
   gcov_write_tag_length (GCOV_TAG_FOR_COUNTER (t_ix),
 			 GCOV_TAG_COUNTER_LENGTH (disk_size));
 
   for (unsigned i = 0; i < counters; i++)
 {
-  gcov_type pair_count = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 1];
   gcov_write_counter (ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i]);
-  gcov_write_counter (pair_count);
+  gcov_write_counter (list_sizes[i]);
   gcov_type start = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 2];
+
+  unsigned j = 0;
   for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
-	   node != NULL; node = node->next)
+	   node != NULL; node = node->next, j++)
 	{
 	  gcov_write_counter (node->value);
 	  gcov_write_counter (node->count);
+
+	  /* Stop when we reach expected number of items.  */
+	  if (j + 1 == list_sizes[i])
+	break;
 	}
 }
+
+  free (list_sizes);
 }
 
 /* Write counters in GI_PTR and the summary in PRG to a gcda file. In
@@ -425,7 +446,7 @@ write_one_data (const struct gcov_info *gi_ptr,
 	  n_counts = ci_ptr->num;
 
 	  if (t_ix == GCOV_COUNTER_V_TOPN || t_ix == GCOV_COUNTER_V_INDIR)
-	write_top_counters (ci_ptr, t_ix, n_counts);
+	write_topn_counters (ci_ptr, t_ix, n_counts);
 	  else
 	{
 	  /* Do not stream when all counters are zero.  */
diff --git a/libgcc/libgcov-merge.c b/libgcc/libgcov-merge.c
index 7db188a4f4c..3379b688128 100644
--- a/libgcc/libgcov-merge.c
+++ b/libgcc/libgcov-merge.c
@@ -109,6 +109,7 @@ __gcov_merge_topn (gcov_type *counters, unsigned n_counters)
   /* First value is number of total executions of the profiler.  */
   gcov_type all = gcov_get_counter_ignore_scaling (-1);
   gcov_type n = gcov_get_counter_ignore_scaling (-1);
+  gcc_assert (n <= GCOV_TOPN_MAXIMUM_TRACKED_VALUES);
 
   unsigned full = all < 0;
   gcov_type *total = &counters[GCOV_TOPN_MEM_COUNTERS * i];
-- 
2.30.0

Re: [PATCH] split, i386, v5: Fix up df uses in i386 splitters [PR99104]

2021-02-17 Thread Jakub Jelinek via Gcc-patches

On Wed, Feb 17, 2021 at 10:30:06AM +, Richard Sandiford wrote:
> Hmm.  I think that just means that the optimisation performed by
> the copy constructor isn't valid in practice (even if it should be
> in principle).  Guess this is the curse of manipulating data structures
> directly rather than through well-defined interfaces :-)
> 
> TBH, since the obvious thing didn't work, I think we should keep it
> simple and take the hit of the full copy.  It seems less error-prone
> and I'd be surprised if the overhead was noticeable in practice.
> It also means that if we do manage to optimise the copy in future,
> all callers would automatically benefit.
> 
> I.e. my preference would be to go for your original patch without
> the recog.[hc] bits.

It is more than 1KB there and back each time we call one of those
splitter conditions.
When we know only the recog_data.{operand,insn} part is needed...
Only copying that would be 240B there, back + clear of 8 bytes.
Anyway, don't feel strongly about that.

Jakub

Re: [PATCH] MIPS: Fix PR target/98491 (ChangeLog)

2021-02-17 Thread Richard Sandiford via Gcc-patches

Xi Ruoyao via Gcc-patches  writes:
>> > > > I can't understand the comment either.  To me it looks like it's
>> > > > possible to
>> > > > remove this "if (MSA_SUPPORTED_P (mode)) return 0;"

I think the point is that the MSA loads and stores only have a 10-bit
offset field instead of the usual 16-bit offset field and so the usual
approaches to handling symbolic addresses won't work.

>> > > > 
>> > > > CC Robert to get some help.
>> > > Happy new lunar year folks.
>> > > 
>> > > I found a newer email address of Robert.  Hope it is still being used.
>> > > 
>> > > Could someone update MAINTAINERS file by the way?
>> > If you have an updated email address, I can reach out to Robert and see
>> > if he wants his entry updated or removed.
>> 
>>  His latest reply in gcc mail lists used robert.sucha...@mips.com.  But when 
>> I
>> sent mail to it, the mail was just rejected with "access denied".  Google 
>> told
>> me Office 365 mail service (used by mips.com) rejects mail to deleted 
>> accounts
>> with "access denied". So I'm not sure if this email address is invalid again,
>> or
>> Office 365 just dislikes me...
>
> Hi Jeff,
>
> I think it's better to just fix the out-of-bound array access now by special
> casing MAX_MACHINE_MODE, if we can't figure out if this entry should be 
> removed.
> Either in MSA_SUPPORTED_P or in mips_symbol_insns.
>
> It's really "irrational" to leave such a obvious programming error in new GCC 
> 11
> release...  And I've built a Linux system (in Linux From Scratch way, X11 was
> built and it runs correctly now) on the Loongson 3A4000 with patched 
> GCC-10.2.0,
> and "-O3 -mmsa" in CFLAGS for most packages so the change should be OK.

Yeah, agreed.  I think the mips_symbol_insns patch is the right one,
so I've pushed to it trunk.  I think it's also worth backporting
to release branches, but let me know how far back you've tested it.

Thanks for the patch.

Richard

Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-17 Thread Jozef Lawrynowicz

On Tue, Feb 16, 2021 at 05:45:47PM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Mon, Feb 15, 2021 at 02:35:07PM -0800, H.J. Lu via Gcc-patches wrote:
> > When building Linux kernel, ld in bninutils 2.36 with GCC 11 generates
> > thousands of
> > 
> > ld: warning: orphan section `.data.event_initcall_finish' from 
> > `init/main.o' being placed in section `.data.event_initcall_finish'
> > ld: warning: orphan section `.data.event_initcall_start' from `init/main.o' 
> > being placed in section `.data.event_initcall_start'
> > ld: warning: orphan section `.data.event_initcall_level' from `init/main.o' 
> > being placed in section `.data.event_initcall_level'
> > 
> > Since these sections are marked with SHF_GNU_RETAIN, they are placed in
> > separate sections.  They become orphan sections since they aren't expected
> > in the Linux kernel linker script. But orphan sections normally don't work
> > well with the Linux kernel linker script and the resulting kernel crashed.
> > 
> > Add -fgnu-retain to disable SHF_GNU_RETAIN for Linux kernel build with
> > -fno-gnu-retain.
> 
> I'd say this shows that the changing of meaning of "used" attribute wasn't
> really a good idea, the Linux kernel certainly won't be the only problem
> and people use "used" attribute for many reasons and don't necessarily want
> the different sections or different behavior of those sections if they use
> it.
> 
> So, can't we instead:
> 1) restore the old behavior of "used" by default
> 2) add "retain" attribute that implies the SHF_GNU_RETAIN stuff and fails
>if the configured assembler/linker doesn't support those
> 3) add a non-default option through which one could opt in for "used"
>attribute to imply retain attribute too for projects that want that?
> 

In previous discussions, it seemed to me that there was general support
for the "used" attribute implying SHF_GNU_RETAIN and saving symbols from
linker garbage collection[1]. Of course, the current implementation
results in undesirable behavior - the thought that all linker scripts
not supporting uniquely named sections would need to be updated is quite
alarming.

It's a shame that all this extra complication is required, just because
we cannot have a ".retain ", directive.

My preferred vision for this functionality was:
  - SHF_GNU_RETAIN section flag indicates a section should be saved
from linker garbage collection.
  - ".retain " assembler directive applies SHF_GNU_RETAIN
to the section containing .
  - GCC "used" attribute emits a ".retain " directive for
the symbol declaration is is applied to.  Applying the "used"
attribute to a symbol declaration does not change the structure of
the object file, beyond applying SHF_GNU_RETAIN to the section
containing that symbol.

H.J. vetoed ".retain "[2], since it fails the predicate:
  Assembler directive referring to a symbol must operate on the symbol
  table.

So because of this veto, we have compromised on "code quality" so far,
since any linker script not supporting named sections, used to link an
application containing "used" symbols (now put into their own section) has
potential undefined behavior.

With the new proposed change to use a "retain" attribute, we now
compromise on functionality, since the "used" attribute saving symbols
from linker garbage collection is disabled by default. 

All because we cannot introduce an exception to the above predicate.

I would like to re-open discussions on whether a ".retain 
directive is acceptable. This would enable "used" to imply
SHF_GNU_RETAIN, without changing the structure of the object file,
thereby allowing the new functionality to be used without linker script
modifications.

If a Binutils global maintainer could side one way or the other on
".retain " (an opinion was never given by the Binutils
maintainers when we had the discussions on that mailing list, although
Jeff did support .retain in the GCC discussions[3]), then it will be
easier to proceed knowing definitively that ".retain" is rejected and we
have no choice but to go this route of having a separate "retain"
attribute.

Thanks,
Jozef

[1] https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557097.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558123.html
[3] https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558398.html

Re: [PATCH] MIPS: Fix PR target/98491 (ChangeLog)

2021-02-17 Thread Xi Ruoyao via Gcc-patches

On 2021-02-16 11:59 +0800, Xi Ruoyao wrote:
> On 2021-02-15 16:16 -0700, Jeff Law wrote:
> > 
> > 
> > On 2/12/21 7:17 AM, Xi Ruoyao wrote:
> > > On 2021-01-11 01:01 +0800, Xi Ruoyao wrote:
> > > > Hi Jeff and Jakub,
> > > > 
> > > > On 2021-01-04 14:19 -0700, Jeff Law wrote:
> > > > > On 1/4/21 2:00 PM, Jakub Jelinek wrote:
> > > > > > On Mon, Jan 04, 2021 at 01:51:59PM -0700, Jeff Law via Gcc-patches
> > > > > > wrote:
> > > > > > > > Sorry, I forgot to include the ChangeLog:
> > > > > > > > 
> > > > > > > >     gcc/ChangeLog:
> > > > > > > >     
> > > > > > > >     2020-12-31  Xi Ruoyao 
> > > > > > > >     
> > > > > > > >     PR target/98491
> > > > > > > >     * config/mips/mips.c (mips_symbol_insns): Do not use
> > > > > > > >   MSA_SUPPORTED_MODE_P if mode is MAX_MACHINE_MODE.
> > > > > > > So I absolutely agree the current code is wrong as it does an out
> > > > > > > of
> > > > > > > bounds array access.
> > > > > > > 
> > > > > > > 
> > > > > > > Would it be better to instead to change MSA_SUPPORTED_MODE_P to
> > > > > > > evaluate
> > > > > > > to zero if MODE is MAX_MACHINE_MODE?  That would protect all the
> > > > > > > uses
> > > > > > > of
> > > > > > > MSA_SUPPORTED_MODE_P.    Something like this perhaps?
> > > > > > But MAX_MACHINE_MODE is the one past last valid mode, I'm not aware
> > > > > > of
> > > > > > any target that would protect all macros that deal with modes that
> > > > > > way.
> > > > > > 
> > > > > > So, perhaps best would be stop using the MAX_MACHINE_MODE as magic
> > > > > > value
> > > > > > for that function and instead use say VOIDmode that shouldn't
> > > > > > normally
> > > > > > appear either?
> > > > > I think we have to allow VOIDmode because constants don't necessarily
> > > > > have modes.   And I certainly agree that using MAX_MACHINE_MODE like
> > > > > this is ugly and error prone (as we can see from the BZ).
> > > > > 
> > > > > I also couldn't convince myself that the code and comments were
> > > > > actually
> > > > > consistent, particularly for MSA targets which the comment claims can
> > > > > never handle constants for ld/st (and thus should be returning 0 for
> > > > > MAX_MACHINE_MODE).  Though maybe mips_symbol_insns_1 ultimately
> > > > > handles
> > > > > that correctly.
> > > > > 
> > > > > 
> > > > > > But I don't really see anything wrong on the mips_symbol_insns above
> > > > > > change either.
> > > > > Me neither.  I'm just questioning if bullet-proofing in the
> > > > > MSA_SUPPORTED_MODE_P would be a better option.  While I've worked in
> > > > > the
> > > > > MIPS port in the past, I don't really have any significannt experience
> > > > > with the MSA support.
> > > > I can't understand the comment either.  To me it looks like it's
> > > > possible to
> > > > remove this "if (MSA_SUPPORTED_P (mode)) return 0;"
> > > > 
> > > > CC Robert to get some help.
> > > Happy new lunar year folks.
> > > 
> > > I found a newer email address of Robert.  Hope it is still being used.
> > > 
> > > Could someone update MAINTAINERS file by the way?
> > If you have an updated email address, I can reach out to Robert and see
> > if he wants his entry updated or removed.
> 
>  His latest reply in gcc mail lists used robert.sucha...@mips.com.  But when I
> sent mail to it, the mail was just rejected with "access denied".  Google told
> me Office 365 mail service (used by mips.com) rejects mail to deleted accounts
> with "access denied". So I'm not sure if this email address is invalid again,
> or
> Office 365 just dislikes me...

Hi Jeff,

I think it's better to just fix the out-of-bound array access now by special
casing MAX_MACHINE_MODE, if we can't figure out if this entry should be removed.
Either in MSA_SUPPORTED_P or in mips_symbol_insns.

It's really "irrational" to leave such a obvious programming error in new GCC 11
release...  And I've built a Linux system (in Linux From Scratch way, X11 was
built and it runs correctly now) on the Loongson 3A4000 with patched GCC-10.2.0,
and "-O3 -mmsa" in CFLAGS for most packages so the change should be OK.
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

Re: [PATCH] split, i386, v5: Fix up df uses in i386 splitters [PR99104]

2021-02-17 Thread Richard Sandiford via Gcc-patches

Jakub Jelinek  writes:
> On Tue, Feb 16, 2021 at 03:03:56PM +, Richard Sandiford via Gcc-patches 
> wrote:
>> > On Tue, Feb 16, 2021 at 01:09:43PM +, Richard Sandiford wrote:
>> >> Can I put in a plea to put this in recog.[hc], and possibly also make
>> >> it a copy constructor for recog_data_d?  I can't think of any legitimate
>> >> cases in which we'd want to copy the whole structure, instead of just
>> >> the active parts.
>> >
>> > Good idea.  So like this?
>> >
>> > 2021-02-16  Jakub Jelinek  
>> >
>> >PR target/99104
>> >* recog.h (recog_data_d::recog_data_d ()): New defaulted default ctor.
>> >(recog_data_d::recog_data_d (const recog_data_d &)): Declare.
>> >(recog_data_d::operator = (const recog_data_d &)): Declare.
>> >* recog.c (recog_data_d::operator = (const recog_data_d &)): New
>> >assignment operator.
>> >(recog_data_d::recog_data_d (const recog_data_d &)): New copy ctor.
>> 
>> OK, thanks.
>
> Unfortunately, that caused regressions:
> +FAIL: gcc.target/i386/pad-3.c (internal compiler error)
> +FAIL: gcc.target/i386/pad-3.c (test for excess errors)
> +UNRESOLVED: gcc.target/i386/pad-3.c scan-assembler-not nop
> +UNRESOLVED: gcc.target/i386/pad-3.c scan-assembler-not rep
> +FAIL: gcc.target/i386/pr93611.c (internal compiler error)
> +FAIL: gcc.target/i386/pr93611.c (test for excess errors)
>
> The thing is that both generated split_insns and peephole2_insns functions
> start with:
>   rtx * const operands ATTRIBUTE_UNUSED = &recog_data.operand[0];
> ...
>   recog_data.insn = NULL;
> and then mix various tests with operands[???] = ...; statements.
> So, the recog.[ch] added copy ctor and assignment operator can be perhaps
> useful for other cases but not for the splitter/peephole2 conditions.
>
> So, either we can still use those and then copy rest of operands array
> (there is nothing that tells how many are valid), like in the patch below
> that passed x86_64-linux and i686-linux bootstrap and regtests, i.e.
>   /* Remember recog_data content.  */
>   struct recog_data_d recog_data_save = recog_data;
>   /* While recog_data normally contains just n_operands operands in
>  operand array, during split_all_insns or peephole2 it can contain
>  arbitrary number of operands that are being matched.  */
>   for (int i = recog_data.n_operands; i < MAX_RECOG_OPERANDS; i++)
> recog_data_save.operand[i] = recog_data.operand[i];
>
>   dist_define = distance_non_agu_define (regno1, regno2, insn);
>   dist_use = distance_agu_use (regno0, insn);
>
>   /* distance_non_agu_define can call get_attr_type which can call
>  recog_memoized, restore recog_data back to previous content.  */
>   recog_data = recog_data_save;
>   for (int i = recog_data_save.n_operands; i < MAX_RECOG_OPERANDS; i++)
> recog_data.operand[i] = recog_data_save.operand[i];
> or we could save less by:
>   rtx operands[MAX_RECOG_OPERANDS];
>   memcpy (operands, recog_data.operand, sizeof operands);
>
>   dist_define = distance_non_agu_define (regno1, regno2, insn);
>   dist_use = distance_agu_use (regno0, insn);
>
>   memcpy (recog_data.operand, operands, sizeof operands);
>   recog_data.insn = NULL;
> and up to you whether the recog.[hc] changes should be kept or not.

Hmm.  I think that just means that the optimisation performed by
the copy constructor isn't valid in practice (even if it should be
in principle).  Guess this is the curse of manipulating data structures
directly rather than through well-defined interfaces :-)

TBH, since the obvious thing didn't work, I think we should keep it
simple and take the hit of the full copy.  It seems less error-prone
and I'd be surprised if the overhead was noticeable in practice.
It also means that if we do manage to optimise the copy in future,
all callers would automatically benefit.

I.e. my preference would be to go for your original patch without
the recog.[hc] bits.

Thanks,
Richard


>
> In any case, I'm surprised why what the splitters did so far worked at all,
> because it isn't always the case that the split_insns or peephole2_insns
> filled operands don't really need to match what extract_insn would fill.
>
> Sorry for this taking so long.
>
> 2021-02-17  Jakub Jelinek  
>
>   PR target/99104
>   * recog.h (recog_data_d::recog_data_d ()): New defaulted default ctor.
>   (recog_data_d::recog_data_d (const recog_data_d &)): Declare.
>   (recog_data_d::operator = (const recog_data_d &)): Declare.
>   * recog.c (recog_data_d::operator = (const recog_data_d &)): New
>   assignment operator.
>   (recog_data_d::recog_data_d (const recog_data_d &)): New copy ctor.
>   * config/i386/i386.c (distance_non_agu_define): Don't call
>   extract_insn_cached here.
>   (ix86_lea_outperforms): Save and restore recog_data around call
>   to distance_non_agu_define and distance_agu_use.
>   (ix86_ok_to_clobber_flags): Remove.
>   (ix86_avoid_lea_for_add): Don't call ix86_ok_to_clobber_flags.
>

[PATCH] array-bounds: Fix up ICE on overaligned variables [PR99109]

2021-02-17 Thread Jakub Jelinek via Gcc-patches

Hi!

check_mem_ref builds artificial arrays for variables that don't have
array type.
The C standard says:
"For the purposes of these operators, a pointer to an object that is not an 
element of an
array behaves the same as a pointer to the first element of an array of length 
one with the
type of the object as its element type."
so it isn't completely wrong and does simplify the function.
But, layout_type can fail if the size of the element type is not a multiple
of its alignment (i.e. overaligned types) and we then ICE because of that.

The following patch uses TYPE_MAIN_VARIANT in those cases instead, but only
for the types that need it, as for the diagnostics it is better to use the
typedef names etc. that were really used in the source if possible.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-17  Jakub Jelinek  

PR middle-end/99109
* gimple-array-bounds.cc (overaligned_type_p): New function.
(array_bounds_checker::check_mem_ref): For overaligned types use
TYPE_MAIN_VARIANT of reftype as element type of a new array.

* g++.dg/warn/Warray-bounds-17.C: New test.

--- gcc/gimple-array-bounds.cc.jj   2021-01-04 10:25:37.471249246 +0100
+++ gcc/gimple-array-bounds.cc  2021-02-16 17:00:41.961750114 +0100
@@ -386,6 +386,21 @@ build_zero_elt_array_type (tree eltype)
   return arrtype;
 }
 
+/* Return true if it is not possible to build an array with element type
+   ELTYPE, because either ELTYPE has alignment larger than its size
+   or its size is not a multiple of its alignment.  */
+
+static bool
+overaligned_type_p (tree eltype)
+{
+  return (TYPE_SIZE_UNIT (eltype)
+ && TREE_CODE (TYPE_SIZE_UNIT (eltype)) == INTEGER_CST
+ && !integer_zerop (TYPE_SIZE_UNIT (eltype))
+ && TYPE_ALIGN_UNIT (eltype) > 1
+ && wi::zext (wi::to_wide (TYPE_SIZE_UNIT (eltype)),
+  ffs_hwi (TYPE_ALIGN_UNIT (eltype)) - 1) != 0);
+}
+
 /* Checks one MEM_REF in REF, located at LOCATION, for out-of-bounds
references to string constants.  If VRP can determine that the array
subscript is a constant, check if it is outside valid range.
@@ -553,6 +568,8 @@ array_bounds_checker::check_mem_ref (loc
   reftype = TREE_TYPE (TREE_TYPE (arg));
   if (TREE_CODE (reftype) == ARRAY_TYPE)
reftype = TREE_TYPE (reftype);
+  else if (overaligned_type_p (reftype))
+   reftype = TYPE_MAIN_VARIANT (reftype);
   if (tree refsize = TYPE_SIZE_UNIT (reftype))
if (TREE_CODE (refsize) == INTEGER_CST)
  eltsize = wi::to_offset (refsize);
@@ -675,7 +692,11 @@ array_bounds_checker::check_mem_ref (loc
   /* Treat a reference to a non-array object as one to an array
 of a single element.  */
   if (TREE_CODE (reftype) != ARRAY_TYPE)
-   reftype = build_array_type_nelts (reftype, 1);
+   {
+ if (overaligned_type_p (reftype))
+   reftype = TYPE_MAIN_VARIANT (reftype);
+ reftype = build_array_type_nelts (reftype, 1);
+   }
 
   /* Extract the element type out of MEM_REF and use its size
 to compute the index to print in the diagnostic; arrays
--- gcc/testsuite/g++.dg/warn/Warray-bounds-17.C.jj 2021-02-16 
17:24:14.178813304 +0100
+++ gcc/testsuite/g++.dg/warn/Warray-bounds-17.C2021-02-16 
17:23:35.305251062 +0100
@@ -0,0 +1,15 @@
+// PR middle-end/99109
+// { dg-do compile }
+// { dg-options "-O2 -Warray-bounds" }
+
+typedef int A __attribute__((aligned (64)));
+void foo (int *);
+
+void
+bar (void)
+{
+  A b; // { dg-message "while referencing" }
+  int *p = &b;
+  int *x = (p - 1);// { dg-warning "outside array bounds" }
+  foo (x);
+}

Jakub

[PATCH] i386: Avoid C++ global constructors in every object that includes i386.h

2021-02-17 Thread Jakub Jelinek via Gcc-patches

Hi!

When looking at recog.o when working on the recog.[ch] changes to make sure
I have not introduced runtime construction of recog_data variable, I have
noticed that at least in unoptimized build, every single *.o file that
included i386.h has lots of runtime constructors for all the PTA_*
variables.

As we now require C++11, the following patch makes those constexpr so that
they don't need runtime initialization.
I've verified that ~ 8276 bytes long 
_Z41__static_initialization_and_destruction_0ii
at -O0 is gone from every *.o that included i386.h (and doesn't really need
any global ctors anymore).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-17  Jakub Jelinek  

* wide-int-bitmask.h (wide_int_bitmask::wide_int_bitmask (),
wide_int_bitmask::wide_int_bitmask (uint64_t),
wide_int_bitmask::wide_int_bitmask (uint64_t, uint64_t),
wide_int_bitmask::operator ~ () const,
wide_int_bitmask::operator | (wide_int_bitmask) const,
wide_int_bitmask::operator & (wide_int_bitmask) const): Use constexpr
instead of inline.
* config/i386/i386.h (PTA_3DNOW, PTA_3DNOW_A, PTA_64BIT, PTA_ABM,
PTA_AES, PTA_AVX, PTA_BMI, PTA_CX16, PTA_F16C, PTA_FMA, PTA_FMA4,
PTA_FSGSBASE, PTA_LWP, PTA_LZCNT, PTA_MMX, PTA_MOVBE, PTA_NO_SAHF,
PTA_PCLMUL, PTA_POPCNT, PTA_PREFETCH_SSE, PTA_RDRND, PTA_SSE, PTA_SSE2,
PTA_SSE3, PTA_SSE4_1, PTA_SSE4_2, PTA_SSE4A, PTA_SSSE3, PTA_TBM,
PTA_XOP, PTA_AVX2, PTA_BMI2, PTA_RTM, PTA_HLE, PTA_PRFCHW, PTA_RDSEED,
PTA_ADX, PTA_FXSR, PTA_XSAVE, PTA_XSAVEOPT, PTA_AVX512F, PTA_AVX512ER,
PTA_AVX512PF, PTA_AVX512CD, PTA_NO_TUNE, PTA_SHA, PTA_PREFETCHWT1,
PTA_CLFLUSHOPT, PTA_XSAVEC, PTA_XSAVES, PTA_AVX512DQ, PTA_AVX512BW,
PTA_AVX512VL, PTA_AVX512IFMA, PTA_AVX512VBMI, PTA_CLWB, PTA_MWAITX,
PTA_CLZERO, PTA_NO_80387, PTA_PKU, PTA_AVX5124VNNIW, PTA_AVX5124FMAPS,
PTA_AVX512VPOPCNTDQ, PTA_SGX, PTA_AVX512VNNI, PTA_GFNI, PTA_VAES,
PTA_AVX512VBMI2, PTA_VPCLMULQDQ, PTA_AVX512BITALG, PTA_RDPID,
PTA_PCONFIG, PTA_WBNOINVD, PTA_AVX512VP2INTERSECT, PTA_PTWRITE,
PTA_AVX512BF16, PTA_WAITPKG, PTA_MOVDIRI, PTA_MOVDIR64B, PTA_ENQCMD,
PTA_CLDEMOTE, PTA_SERIALIZE, PTA_TSXLDTRK, PTA_AMX_TILE, PTA_AMX_INT8,
PTA_AMX_BF16, PTA_UINTR, PTA_HRESET, PTA_KL, PTA_WIDEKL, PTA_AVXVNNI,
PTA_X86_64_BASELINE, PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4,
PTA_CORE2, PTA_NEHALEM, PTA_WESTMERE, PTA_SANDYBRIDGE, PTA_IVYBRIDGE,
PTA_HASWELL, PTA_BROADWELL, PTA_SKYLAKE, PTA_SKYLAKE_AVX512,
PTA_CASCADELAKE, PTA_COOPERLAKE, PTA_CANNONLAKE, PTA_ICELAKE_CLIENT,
PTA_ICELAKE_SERVER, PTA_TIGERLAKE, PTA_SAPPHIRERAPIDS, PTA_ALDERLAKE,
PTA_KNL, PTA_BONNELL, PTA_SILVERMONT, PTA_GOLDMONT, PTA_GOLDMONT_PLUS,
PTA_TREMONT, PTA_KNM): Use constexpr instead of const.

--- gcc/wide-int-bitmask.h.jj   2021-01-04 10:25:38.611236338 +0100
+++ gcc/wide-int-bitmask.h  2021-02-16 15:55:55.848542878 +0100
@@ -23,14 +23,14 @@ along with GCC; see the file COPYING3.
 class wide_int_bitmask
 {
 public:
-  inline wide_int_bitmask ();
-  inline wide_int_bitmask (uint64_t l);
-  inline wide_int_bitmask (uint64_t l, uint64_t h);
+  constexpr wide_int_bitmask ();
+  constexpr wide_int_bitmask (uint64_t l);
+  constexpr wide_int_bitmask (uint64_t l, uint64_t h);
   inline wide_int_bitmask &operator &= (wide_int_bitmask);
   inline wide_int_bitmask &operator |= (wide_int_bitmask);
-  inline wide_int_bitmask operator ~ () const;
-  inline wide_int_bitmask operator & (wide_int_bitmask) const;
-  inline wide_int_bitmask operator | (wide_int_bitmask) const;
+  constexpr wide_int_bitmask operator ~ () const;
+  constexpr wide_int_bitmask operator & (wide_int_bitmask) const;
+  constexpr wide_int_bitmask operator | (wide_int_bitmask) const;
   inline wide_int_bitmask operator >> (int);
   inline wide_int_bitmask operator << (int);
   inline bool operator == (wide_int_bitmask) const;
@@ -38,19 +38,19 @@ public:
   uint64_t low, high;
 };
 
-inline
+constexpr
 wide_int_bitmask::wide_int_bitmask ()
 : low (0), high (0)
 {
 }
 
-inline
+constexpr
 wide_int_bitmask::wide_int_bitmask (uint64_t l)
 : low (l), high (0)
 {
 }
 
-inline
+constexpr
 wide_int_bitmask::wide_int_bitmask (uint64_t l, uint64_t h)
 : low (l), high (h)
 {
@@ -72,25 +72,22 @@ wide_int_bitmask::operator |= (wide_int_
   return *this;
 }
 
-inline wide_int_bitmask
+constexpr wide_int_bitmask
 wide_int_bitmask::operator ~ () const
 {
-  wide_int_bitmask ret (~low, ~high);
-  return ret;
+  return wide_int_bitmask (~low, ~high);
 }
 
-inline wide_int_bitmask
+constexpr wide_int_bitmask
 wide_int_bitmask::operator | (wide_int_bitmask b) const
 {
-  wide_int_bitmask ret (low | b.low, high | b.high);
-  return ret;
+  return wide_int_bitmask (low | b.low, high | b.high);
 }
 
-inline wide_int_bitmask
+constexpr wide_int_bitmask
 wide_int_bitmask::operator & (wide_int_

New Russian PO file for 'cpplib' (version 11.1-b20210207)

2021-02-17 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Russian team of translators.  The file is available at:

https://translationproject.org/latest/cpplib/ru.po

(This file, 'cpplib-11.1-b20210207.ru.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Contents of PO file 'cpplib-11.1-b20210207.ru.po'

2021-02-17 Thread Translation Project Robot



cpplib-11.1-b20210207.ru.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

[PATCH] c++: Fix up build_zero_init_1 once more [PR99106]

2021-02-17 Thread Jakub Jelinek via Gcc-patches

Hi!

My earlier build_zero_init_1 patch for flexible array members created
an empty CONSTRUCTOR.  As the following testcase shows, that doesn't work
very well because the middle-end doesn't expect CONSTRUCTOR elements with
incomplete type (that the empty CONSTRUCTOR at the end of outer CONSTRUCTOR
had).

The following patch just doesn't add any CONSTRUCTOR for the flexible array
members, it doesn't seem to be needed.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-17  Jakub Jelinek  

PR sanitizer/99106
* init.c (build_zero_init_1): For flexible array members just return
NULL_TREE instead of returning empty CONSTRUCTOR with non-complete
ARRAY_TYPE.

* g++.dg/ubsan/pr99106.C: New test.

--- gcc/cp/init.c.jj2021-02-12 23:57:30.501141871 +0100
+++ gcc/cp/init.c   2021-02-16 09:29:24.635069944 +0100
@@ -252,7 +252,7 @@ build_zero_init_1 (tree type, tree nelts
 build_one_cst (TREE_TYPE (nelts)));
   /* Treat flexible array members like [0] arrays.  */
   else if (TYPE_DOMAIN (type) == NULL_TREE)
-   max_index = build_minus_one_cst (sizetype);
+   return NULL_TREE;
   else
max_index = array_type_nelts (type);
 
--- gcc/testsuite/g++.dg/ubsan/pr99106.C.jj 2021-02-16 09:35:50.575679899 
+0100
+++ gcc/testsuite/g++.dg/ubsan/pr99106.C2021-02-16 09:35:42.904767167 
+0100
@@ -0,0 +1,5 @@
+// PR sanitizer/99106
+// { dg-do compile }
+// { dg-options "-fsanitize=undefined" }
+
+#include "../ext/flexary38.C"

Jakub

[PATCH] split, i386, v5: Fix up df uses in i386 splitters [PR99104]

2021-02-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Feb 16, 2021 at 03:03:56PM +, Richard Sandiford via Gcc-patches 
wrote:
> > On Tue, Feb 16, 2021 at 01:09:43PM +, Richard Sandiford wrote:
> >> Can I put in a plea to put this in recog.[hc], and possibly also make
> >> it a copy constructor for recog_data_d?  I can't think of any legitimate
> >> cases in which we'd want to copy the whole structure, instead of just
> >> the active parts.
> >
> > Good idea.  So like this?
> >
> > 2021-02-16  Jakub Jelinek  
> >
> > PR target/99104
> > * recog.h (recog_data_d::recog_data_d ()): New defaulted default ctor.
> > (recog_data_d::recog_data_d (const recog_data_d &)): Declare.
> > (recog_data_d::operator = (const recog_data_d &)): Declare.
> > * recog.c (recog_data_d::operator = (const recog_data_d &)): New
> > assignment operator.
> > (recog_data_d::recog_data_d (const recog_data_d &)): New copy ctor.
> 
> OK, thanks.

Unfortunately, that caused regressions:
+FAIL: gcc.target/i386/pad-3.c (internal compiler error)
+FAIL: gcc.target/i386/pad-3.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/pad-3.c scan-assembler-not nop
+UNRESOLVED: gcc.target/i386/pad-3.c scan-assembler-not rep
+FAIL: gcc.target/i386/pr93611.c (internal compiler error)
+FAIL: gcc.target/i386/pr93611.c (test for excess errors)

The thing is that both generated split_insns and peephole2_insns functions
start with:
  rtx * const operands ATTRIBUTE_UNUSED = &recog_data.operand[0];
...
  recog_data.insn = NULL;
and then mix various tests with operands[???] = ...; statements.
So, the recog.[ch] added copy ctor and assignment operator can be perhaps
useful for other cases but not for the splitter/peephole2 conditions.

So, either we can still use those and then copy rest of operands array
(there is nothing that tells how many are valid), like in the patch below
that passed x86_64-linux and i686-linux bootstrap and regtests, i.e.
  /* Remember recog_data content.  */
  struct recog_data_d recog_data_save = recog_data;
  /* While recog_data normally contains just n_operands operands in
 operand array, during split_all_insns or peephole2 it can contain
 arbitrary number of operands that are being matched.  */
  for (int i = recog_data.n_operands; i < MAX_RECOG_OPERANDS; i++)
recog_data_save.operand[i] = recog_data.operand[i];

  dist_define = distance_non_agu_define (regno1, regno2, insn);
  dist_use = distance_agu_use (regno0, insn);

  /* distance_non_agu_define can call get_attr_type which can call
 recog_memoized, restore recog_data back to previous content.  */
  recog_data = recog_data_save;
  for (int i = recog_data_save.n_operands; i < MAX_RECOG_OPERANDS; i++)
recog_data.operand[i] = recog_data_save.operand[i];
or we could save less by:
  rtx operands[MAX_RECOG_OPERANDS];
  memcpy (operands, recog_data.operand, sizeof operands);

  dist_define = distance_non_agu_define (regno1, regno2, insn);
  dist_use = distance_agu_use (regno0, insn);

  memcpy (recog_data.operand, operands, sizeof operands);
  recog_data.insn = NULL;
and up to you whether the recog.[hc] changes should be kept or not.

In any case, I'm surprised why what the splitters did so far worked at all,
because it isn't always the case that the split_insns or peephole2_insns
filled operands don't really need to match what extract_insn would fill.

Sorry for this taking so long.

2021-02-17  Jakub Jelinek  

PR target/99104
* recog.h (recog_data_d::recog_data_d ()): New defaulted default ctor.
(recog_data_d::recog_data_d (const recog_data_d &)): Declare.
(recog_data_d::operator = (const recog_data_d &)): Declare.
* recog.c (recog_data_d::operator = (const recog_data_d &)): New
assignment operator.
(recog_data_d::recog_data_d (const recog_data_d &)): New copy ctor.
* config/i386/i386.c (distance_non_agu_define): Don't call
extract_insn_cached here.
(ix86_lea_outperforms): Save and restore recog_data around call
to distance_non_agu_define and distance_agu_use.
(ix86_ok_to_clobber_flags): Remove.
(ix86_avoid_lea_for_add): Don't call ix86_ok_to_clobber_flags.
(ix86_avoid_lea_for_addr): Likewise.  Adjust function comment.
* config/i386/i386.m (*lea): Change from define_insn_and_split
into define_insn.  Move the splitting to define_peephole2 and
check there using peep2_regno_dead_p if FLAGS_REG is dead.

* gcc.dg/pr99104.c: New test.

--- gcc/recog.h.jj  2021-01-15 19:04:42.436445517 +0100
+++ gcc/recog.h 2021-02-16 14:26:00.384934711 +0100
@@ -372,6 +372,12 @@ struct recog_data_d

   /* In case we are caching, hold insn data was generated for.  */
   rtx_insn *insn;
+
+  recog_data_d () = default;
+  /* Copy constructor and assignment operator, so that when copying the
+ structure we copy only the active elements.  */
+  recog_data_d (const recog_data_d &);
+  recog_data_d &operator = (const recog_data_d &);
 };

 extern st

Re: [patch, fortran] PR96686 Namelist group objects shall be defined before appearing in namelist

2021-02-17 Thread Tobias Burnus


Hi Jerry,

I note that you have not written that testcase and I am still half
aspleep, but I fail to see what's wrong with the following program
(before and after your change):

f2 looks like a local and implicitly typed real variable. At least ifort
compiles this program successfully.

F2018 has: "A namelist group object shall either be accessed by use or
host association or shall have its declared type, kind type parameters
of the declared type, and rank specified by previous specification
statements or the procedure heading in the same scoping unit or by the
implicit typing rules in effect for the scoping unit. If a namelist
group object is typed by the implicit typing rules, its appearance in
any subsequent type declaration statement shall confirm the implied type
and type parameters."

Tobias

On 17.02.21 04:02, Jerry DeLisle wrote:

index 538bceaa4b6..4e021253f01 100644
--- a/gcc/testsuite/gfortran.dg/namelist_4.f90
+++ b/gcc/testsuite/gfortran.dg/namelist_4.f90
@@ -27,14 +27,14 @@ END module M1
  program P1
  CONTAINS
  ! This has the additional wrinkle of a reference to the object.
+  INTEGER FUNCTION F2()
+F2=1
+  END FUNCTION
INTEGER FUNCTION F1()
  NAMELIST /NML3/ F2 ! { dg-error "PROCEDURE attribute conflicts" }
  ! Used to ICE here
-f2 = 1 ! { dg-error "is not a VALUE" }
+f2 = 1 ! { dg-error "is not a variable" }
  F1=1
END FUNCTION
-  INTEGER FUNCTION F2()
-F2=1
-  END FUNCTION
  END

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf

Re: [PATCH] profiling: fix streaming of TOPN counters

2021-02-17 Thread Martin Liška


On 2/16/21 9:46 PM, Richard Biener wrote:

Looks like this can only shrink the race window.


Yes, kind of.


Don't you need atomic accesses or locking here? Can the number of counters 
change?


So what we do: we want to stream out N TOPN counters where each looks like this:

{total_count, number_of_tracked_values, pointer_to_first_tracked_value }

Where tracked values live in a linked list and we always append to a linked 
list.
What happens is that number of tracked values in a linked list does not match
number_of_tracked_values.

I'll write it even more robust...

IIRC the counters are dynamically allocated. Are they ever reallocated?

No, they live in a linked list.

Martin

51 matches

Mail list logo