Re: [PATCH] expr: build string_constant only for a char type

2020-07-28 Thread Martin Liška

On 7/28/20 9:00 AM, Richard Biener via Gcc-patches wrote:

On Mon, Jul 27, 2020 at 10:49 PM Jakub Jelinek via Gcc-patches
 wrote:


On Mon, Jul 27, 2020 at 09:53:31AM -0600, Martin Sebor via Gcc-patches wrote:

   Return a pointer P to a NUL-terminated string containing
   the sequence of bytes corresponding to the representation
   of the object referred to by SRC (or a subsequence of such
   bytes within it if SRC is a reference to an initialized
   constant array plus some constant offset).

I.e., c_getstr returns a STRING_CST for arbitrary non-string
constants.  This enables optimizations like the by-pieces
expansion of calls to raw memory functions like memcpy, or
the folding of other raw memory calls like memchr with non-
string arguments.

c_getstr relies on string_constant for that.  Restricting
the latter function to just character types prevents these
optimizations for zero-initialized constants of other types.
A test case that shows the difference to the by-pieces
expansion goes something like this:


Having STRING_CST in the compiler with arbitrary array type is IMHO a very
bad idea, so if you want something like that, you should come up with a
different representation for that, not STRING_CSTs.
Because most of the compiler assumes STRING_CSTs are what it says, string
literals where elements are some kind of characters, don't have to be
narrow, but better should be integral.
Maybe returning a CONSTRUCTOR with no elements with the right type is a
better idea for that, that in the compiler stands for zero initialized
aggregate.


It's also a much shorter representation (that also works for strings, btw)
if it is all about all-zero "constants".

Richard.


 Jakub



Based on the upcoming discussion, I decided to backport my current fix to 
releases/gcc-10
branch in order to fix the problematic ICE in chromium. I'm ready to backport 
whatever
Martin comes up with.

Martin


Re: [PATCH] expr: build string_constant only for a char type

2020-07-28 Thread Richard Biener via Gcc-patches
On Mon, Jul 27, 2020 at 10:49 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Mon, Jul 27, 2020 at 09:53:31AM -0600, Martin Sebor via Gcc-patches wrote:
> >   Return a pointer P to a NUL-terminated string containing
> >   the sequence of bytes corresponding to the representation
> >   of the object referred to by SRC (or a subsequence of such
> >   bytes within it if SRC is a reference to an initialized
> >   constant array plus some constant offset).
> >
> > I.e., c_getstr returns a STRING_CST for arbitrary non-string
> > constants.  This enables optimizations like the by-pieces
> > expansion of calls to raw memory functions like memcpy, or
> > the folding of other raw memory calls like memchr with non-
> > string arguments.
> >
> > c_getstr relies on string_constant for that.  Restricting
> > the latter function to just character types prevents these
> > optimizations for zero-initialized constants of other types.
> > A test case that shows the difference to the by-pieces
> > expansion goes something like this:
>
> Having STRING_CST in the compiler with arbitrary array type is IMHO a very
> bad idea, so if you want something like that, you should come up with a
> different representation for that, not STRING_CSTs.
> Because most of the compiler assumes STRING_CSTs are what it says, string
> literals where elements are some kind of characters, don't have to be
> narrow, but better should be integral.
> Maybe returning a CONSTRUCTOR with no elements with the right type is a
> better idea for that, that in the compiler stands for zero initialized
> aggregate.

It's also a much shorter representation (that also works for strings, btw)
if it is all about all-zero "constants".

Richard.

> Jakub
>


Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Jakub Jelinek via Gcc-patches
On Mon, Jul 27, 2020 at 09:53:31AM -0600, Martin Sebor via Gcc-patches wrote:
>   Return a pointer P to a NUL-terminated string containing
>   the sequence of bytes corresponding to the representation
>   of the object referred to by SRC (or a subsequence of such
>   bytes within it if SRC is a reference to an initialized
>   constant array plus some constant offset).
> 
> I.e., c_getstr returns a STRING_CST for arbitrary non-string
> constants.  This enables optimizations like the by-pieces
> expansion of calls to raw memory functions like memcpy, or
> the folding of other raw memory calls like memchr with non-
> string arguments.
> 
> c_getstr relies on string_constant for that.  Restricting
> the latter function to just character types prevents these
> optimizations for zero-initialized constants of other types.
> A test case that shows the difference to the by-pieces
> expansion goes something like this:

Having STRING_CST in the compiler with arbitrary array type is IMHO a very
bad idea, so if you want something like that, you should come up with a
different representation for that, not STRING_CSTs.
Because most of the compiler assumes STRING_CSTs are what it says, string
literals where elements are some kind of characters, don't have to be
narrow, but better should be integral.
Maybe returning a CONSTRUCTOR with no elements with the right type is a
better idea for that, that in the compiler stands for zero initialized
aggregate.

Jakub



Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Martin Sebor via Gcc-patches

On 7/27/20 12:54 PM, Martin Liška wrote:

On 7/27/20 5:53 PM, Martin Sebor wrote:

The tests I committed with the change didn't exercise any of
this so that's my bad.  I'm still not sure I understand how
the problem with the incomplete type comes up (I haven't had
a chance to look into the recent updates on the bug yet) but
to retain the optimization (and keep the comments in sync
with the code) I think a better solution than restricting
the function to integers is to limit it to complete types.
Beyond that, extending the function to also constant arrays
or nonzero aggregates will also enable the optimization for
those.


Hello.

I must admit that I'm not super-familiar with that code I modified.
Can you please assign the PR and propose a proper fix? I can then
test it on chromium and I'm also deferring backport of the patch I 
installed

to master.


Sure.  I've been trying to wrap something up and it's been taking
longer than I expected.  I'll look into this as soon as I'm done,
hopefully tomorrow.

Martin


Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Martin Liška

On 7/27/20 5:53 PM, Martin Sebor wrote:

The tests I committed with the change didn't exercise any of
this so that's my bad.  I'm still not sure I understand how
the problem with the incomplete type comes up (I haven't had
a chance to look into the recent updates on the bug yet) but
to retain the optimization (and keep the comments in sync
with the code) I think a better solution than restricting
the function to integers is to limit it to complete types.
Beyond that, extending the function to also constant arrays
or nonzero aggregates will also enable the optimization for
those.


Hello.

I must admit that I'm not super-familiar with that code I modified.
Can you please assign the PR and propose a proper fix? I can then
test it on chromium and I'm also deferring backport of the patch I installed
to master.

Thanks,
Martin


Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Martin Sebor via Gcc-patches

On 7/27/20 6:32 AM, Martin Liška wrote:

Hey.

As mentioned in the PR, we should not create a string constant for a type
that is different from char_type_node. Looking at expr.c, I was inspired
and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that 
underlying

type is a character type.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests. 
And it fixes chromium

build with gcc-10 branch with the patch applied.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

 PR tree-optimization/96058
 * expr.c (string_constant): Build string_constant only
 for a type that is main variant of char_type_node.
---
  gcc/expr.c | 22 +-
  1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 5db0a7a8565..c3fdd82b319 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11828,17 +11828,21 @@ string_constant (tree arg, tree *ptr_offset, 
tree *mem_size, tree *decl)

  chartype = TREE_TYPE (chartype);
    while (TREE_CODE (chartype) == ARRAY_TYPE)
  chartype = TREE_TYPE (chartype);
-  /* Convert a char array to an empty STRING_CST having an array
- of the expected type and size.  */
-  if (!initsize)
-  initsize = integer_zero_node;

-  unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
-  init = build_string_literal (size, NULL, chartype, size);
-  init = TREE_OPERAND (init, 0);
-  init = TREE_OPERAND (init, 0);
+  if (TYPE_MAIN_VARIANT (chartype) == char_type_node)


The change to c_getstr I recently committed made it clear that
the function can:

  Return a pointer P to a NUL-terminated string containing
  the sequence of bytes corresponding to the representation
  of the object referred to by SRC (or a subsequence of such
  bytes within it if SRC is a reference to an initialized
  constant array plus some constant offset).

I.e., c_getstr returns a STRING_CST for arbitrary non-string
constants.  This enables optimizations like the by-pieces
expansion of calls to raw memory functions like memcpy, or
the folding of other raw memory calls like memchr with non-
string arguments.

c_getstr relies on string_constant for that.  Restricting
the latter function to just character types prevents these
optimizations for zero-initialized constants of other types.
A test case that shows the difference to the by-pieces
expansion goes something like this:

  const struct { char a[64]; } x = { 0 };

  void f (void *d)
  {
__builtin_memcpy (d, , sizeof x - 1);
  }

A test case for the memchr folding is similar:

  const struct { char a[64]; } x = { 0 };

  int f (void *d)
  {
return __builtin_memchr (, 0, sizeof x) != 0;
  }

The tests I committed with the change didn't exercise any of
this so that's my bad.  I'm still not sure I understand how
the problem with the incomplete type comes up (I haven't had
a chance to look into the recent updates on the bug yet) but
to retain the optimization (and keep the comments in sync
with the code) I think a better solution than restricting
the function to integers is to limit it to complete types.
Beyond that, extending the function to also constant arrays
or nonzero aggregates will also enable the optimization for
those.

Martin


+    {
+  /* Convert a char array to an empty STRING_CST having an array
+ of the expected type and size.  */
+  if (!initsize)
+    initsize = integer_zero_node;
+
+  unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
+  init = build_string_literal (size, NULL, chartype, size);
+  init = TREE_OPERAND (init, 0);
+  init = TREE_OPERAND (init, 0);

-  *ptr_offset = integer_zero_node;
+  *ptr_offset = integer_zero_node;
+    }
  }

    if (decl)




Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Jakub Jelinek via Gcc-patches
On Mon, Jul 27, 2020 at 04:12:09PM +0200, Martin Liška wrote:
> On 7/27/20 3:16 PM, Jakub Jelinek wrote:
> > On Mon, Jul 27, 2020 at 02:32:15PM +0200, Martin Liška wrote:
> > > As mentioned in the PR, we should not create a string constant for a type
> > > that is different from char_type_node. Looking at expr.c, I was inspired
> > > and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that 
> > > underlying
> > > type is a character type.
> > 
> > That doesn't look correct, there is char, signed char, unsigned char,
> > or say std::byte, and all of them are perfectly fine.
> > So, rather than requiring it is char and nothing else, you should instead
> > check that it is an INTEGRAL_TYPE_P (maybe better other than BOOLEAN_TYPE?),
> > which is complete and has the same TYPE_PRECISION as char_type_node.
> 
> All right, the following survives tests and bootstraps.

LGTM.

Jakub



Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Martin Liška

On 7/27/20 3:16 PM, Jakub Jelinek wrote:

On Mon, Jul 27, 2020 at 02:32:15PM +0200, Martin Liška wrote:

As mentioned in the PR, we should not create a string constant for a type
that is different from char_type_node. Looking at expr.c, I was inspired
and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that 
underlying
type is a character type.


That doesn't look correct, there is char, signed char, unsigned char,
or say std::byte, and all of them are perfectly fine.
So, rather than requiring it is char and nothing else, you should instead
check that it is an INTEGRAL_TYPE_P (maybe better other than BOOLEAN_TYPE?),
which is complete and has the same TYPE_PRECISION as char_type_node.


All right, the following survives tests and bootstraps.

Ready to be installed?
Thanks,
Martin



Jakub



>From d40967eb2ef27b512ae177b1aee5e85ac2246acd Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 27 Jul 2020 12:30:24 +0200
Subject: [PATCH] expr: build string_constant only for a char type

gcc/ChangeLog:

	PR tree-optimization/96058
	* expr.c (string_constant): Build string_constant only
	for a type that has same precision as char_type_node
	and is an integral type.
---
 gcc/expr.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 5db0a7a8565..a150fa0d3b5 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11828,17 +11828,22 @@ string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
 	chartype = TREE_TYPE (chartype);
   while (TREE_CODE (chartype) == ARRAY_TYPE)
 	chartype = TREE_TYPE (chartype);
-  /* Convert a char array to an empty STRING_CST having an array
-	 of the expected type and size.  */
-  if (!initsize)
-	  initsize = integer_zero_node;
 
-  unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
-  init = build_string_literal (size, NULL, chartype, size);
-  init = TREE_OPERAND (init, 0);
-  init = TREE_OPERAND (init, 0);
+  if (INTEGRAL_TYPE_P (chartype)
+	  && TYPE_PRECISION (chartype) == TYPE_PRECISION (char_type_node))
+	{
+	  /* Convert a char array to an empty STRING_CST having an array
+	 of the expected type and size.  */
+	  if (!initsize)
+	initsize = integer_zero_node;
+
+	  unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
+	  init = build_string_literal (size, NULL, chartype, size);
+	  init = TREE_OPERAND (init, 0);
+	  init = TREE_OPERAND (init, 0);
 
-  *ptr_offset = integer_zero_node;
+	  *ptr_offset = integer_zero_node;
+	}
 }
 
   if (decl)
-- 
2.27.0



Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Jakub Jelinek via Gcc-patches
On Mon, Jul 27, 2020 at 02:32:15PM +0200, Martin Liška wrote:
> As mentioned in the PR, we should not create a string constant for a type
> that is different from char_type_node. Looking at expr.c, I was inspired
> and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that 
> underlying
> type is a character type.

That doesn't look correct, there is char, signed char, unsigned char,
or say std::byte, and all of them are perfectly fine.
So, rather than requiring it is char and nothing else, you should instead
check that it is an INTEGRAL_TYPE_P (maybe better other than BOOLEAN_TYPE?),
which is complete and has the same TYPE_PRECISION as char_type_node.

Jakub



[PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Martin Liška

Hey.

As mentioned in the PR, we should not create a string constant for a type
that is different from char_type_node. Looking at expr.c, I was inspired
and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that 
underlying
type is a character type.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests. And it 
fixes chromium
build with gcc-10 branch with the patch applied.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR tree-optimization/96058
* expr.c (string_constant): Build string_constant only
for a type that is main variant of char_type_node.
---
 gcc/expr.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 5db0a7a8565..c3fdd82b319 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11828,17 +11828,21 @@ string_constant (tree arg, tree *ptr_offset, tree 
*mem_size, tree *decl)
chartype = TREE_TYPE (chartype);
   while (TREE_CODE (chartype) == ARRAY_TYPE)
chartype = TREE_TYPE (chartype);
-  /* Convert a char array to an empty STRING_CST having an array
-of the expected type and size.  */
-  if (!initsize)
- initsize = integer_zero_node;
 
-  unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);

-  init = build_string_literal (size, NULL, chartype, size);
-  init = TREE_OPERAND (init, 0);
-  init = TREE_OPERAND (init, 0);
+  if (TYPE_MAIN_VARIANT (chartype) == char_type_node)
+   {
+ /* Convert a char array to an empty STRING_CST having an array
+of the expected type and size.  */
+ if (!initsize)
+   initsize = integer_zero_node;
+
+ unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
+ init = build_string_literal (size, NULL, chartype, size);
+ init = TREE_OPERAND (init, 0);
+ init = TREE_OPERAND (init, 0);
 
-  *ptr_offset = integer_zero_node;

+ *ptr_offset = integer_zero_node;
+   }
 }
 
   if (decl)

--
2.27.0