Re: [RFC] Possible folding opportunities for string built-ins

2016-10-13 Thread Richard Biener
On Wed, Oct 12, 2016 at 3:48 PM, Martin Liška wrote:
> Hi.
>
> As you have probably noticed, a simple folding improvement has grown into
> multiple patches and multiple iterations. Apart from that, I also noticed
> that we do not do the best for a couple of cases, and I would like feedback
> on whether it is worth improving them.
>
> $ cat /tmp/string-folding-missing.c
> const char global_1[4] = {'a', 'b', 'c', 'd' };
> const char global_2[6] = "abcdefghijk";
>
> int main()
> {
>   const char local1[] = "asdfasdfasdf";
>
>   /* Case 1 */
>   __builtin_memchr (global_1, 'c', 5);
>
>   /* Case 2 */
>   __builtin_memchr (global_2, 'c', 5);
>
>   /* Case 3 */
>   __builtin_memchr (local1, 'a', 5);
>
>   return 0;
> }
>
> Cases:
> 1) Currently, c_getstr (which calls string_constant) can't handle a
> CONSTRUCTOR. A potential solution is to create a STRING_CST on demand;
> however, as string_constant is called multiple times, that may be overkill.

I believe somewhere during genericization / gimplification we should
simply turn those CONSTRUCTORs into STRING_CSTs ... (probably not in
string_constant itself, as that would be a somewhat gross place for it).
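
To make the point concrete at the source level (this is only a sketch, not GCC
internals): a brace-enclosed list of characters and a string literal that
exactly fills the array produce the same bytes, so a bounded search gives the
same answer for both. The array names and helper functions below are
hypothetical.

```c
#include <string.h>

/* Brace-enclosed initializer; per the thread, this is the form that
   reaches string_constant as a CONSTRUCTOR. */
static const char from_ctor[4] = { 'a', 'b', 'c', 'd' };

/* String literal that exactly fills the array.  Valid in C (not C++);
   no terminating NUL is stored. */
static const char from_string[4] = "abcd";

/* Returns 1 when both arrays hold identical bytes.  */
static int same_bytes (void)
{
  return memcmp (from_ctor, from_string, sizeof from_ctor) == 0;
}

/* Offset of 'c' within the array bounds, or -1 if it is absent.  */
static long offset_of_c (void)
{
  const char *p = memchr (from_ctor, 'c', sizeof from_ctor);
  return p ? (long) (p - from_ctor) : -1;
}
```

Note that the search length here is sizeof from_ctor (4), not 5 as in the
test case, so the read stays inside the object.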

> 2) /tmp/x.c:2:26: warning: initializer-string for array of chars is too long
>  const char global_2[6] = "abcdefghijk";
> Here I'm not sure whether one can consider global_2 == "abcdef" (w/o a
> trailing zero char) or not.
> If so, adding a new output argument (string_length) to string_constant could
> be a solution.

Likewise, if we are able to warn, the FE should be able to truncate the
STRING_CST itself.  The question is still whether a non-NUL-terminated
string should be constant folded (it looks like the STRING_PTR in a
STRING_CST is always '\0' terminated).
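
A rough C model of the "explicit length" idea, assuming a string_constant-like
query that reports the number of valid bytes alongside the pointer. The names
get_bytes and bounded_find are made up for illustration, and declining when
the requested count exceeds the known length is my assumption about what a
careful folder would do, not something the thread settles.

```c
#include <stddef.h>
#include <string.h>

/* Exactly six characters, so no terminating NUL is stored
   (valid in C, not C++).  */
static const char global_2[6] = "abcdef";

/* Hypothetical stand-in for a query that returns the bytes together
   with their count, since strlen cannot be used on unterminated data.  */
static const char *get_bytes (size_t *length)
{
  *length = sizeof global_2;
  return global_2;
}

/* Returns the offset of C within the first N bytes, -1 if it is not
   there, or -2 when N exceeds the known length (i.e. do not fold).  */
static long bounded_find (int c, size_t n)
{
  size_t len;
  const char *s = get_bytes (&len);

  if (n > len)
    return -2;

  const char *p = memchr (s, c, n);
  return p ? (long) (p - s) : -1;
}
```

With the length carried separately, memchr-style folding never needs to rely
on a terminator being present.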

> 3) Currently, ctor_for_folding returns error_mark_node for local variables.
> I'm wondering whether returning DECL_INITIAL for these would be doable?
> Would it cause any issues for LTO?

They do not prevail (ok, you might see this during GENERIC folding).
They get lowered to runtime initialization either from a constant or a
CONST_DECL (with DECL_INITIAL).  For those CONST_DECLs we should return
a ctor_for_folding.
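
As a source-level analogy only (the actual lowering happens on GIMPLE and may
look different), the local array from case 3 behaves roughly as if it were
filled at run time from a hidden constant. The name local1_init and the helper
find_in_local are hypothetical.

```c
#include <string.h>

/* Hypothetical name for the hidden constant holding the initializer.  */
static const char local1_init[] = "asdfasdfasdf";

/* Conceptual equivalent of
     const char local1[] = "asdfasdfasdf";
   after lowering: the local is copied from the constant at run time.
   Returns the offset of C in the first N bytes, or -1.  */
static long find_in_local (int c, size_t n)
{
  char local1[sizeof local1_init];
  memcpy (local1, local1_init, sizeof local1);

  if (n > sizeof local1)
    return -1;

  const char *p = memchr (local1, c, n);
  return p ? (long) (p - local1) : -1;
}
```

Since the bytes of local1 are known from the constant, a folder that can see
that constant could answer the memchr without looking at the local itself.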

Richard.

> One last question: can strcasecmp be folded aggressively in a host
> compiler? Or are there situations where the result depends on the locale?
>
> Thanks for thoughts.
> Martin


Re: [RFC] Possible folding opportunities for string built-ins

2016-10-12 Thread Jim Wilson

On 10/12/2016 08:55 AM, Joseph Myers wrote:
> On Wed, 12 Oct 2016, Martin Liška wrote:
>
>> One last question: can strcasecmp be folded aggressively in a host
>> compiler? Or are there situations where the result depends on the locale?
>
> There are the usual issues with Turkish locales having the uppercase
> version of 'i' being 'İ' and the lowercase version of 'I' being 'ı'.

See for instance
  https://en.wikipedia.org/wiki/Dotted_and_dotless_I

Jim



Re: [RFC] Possible folding opportunities for string built-ins

2016-10-12 Thread Joseph Myers
On Wed, 12 Oct 2016, Martin Liška wrote:

> One last question: can strcasecmp be folded aggressively in a host
> compiler? Or are there situations where the result depends on the locale?

There are the usual issues with Turkish locales having the uppercase 
version of 'i' being 'İ' and the lowercase version of 'I' being 'ı'.
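
A small sketch for experimenting with this, assuming a POSIX system that
provides strcasecmp in <strings.h>. The helper equal_ignoring_case is
hypothetical; it runs the comparison under a named locale and reports when
that locale is not installed. Whether a Turkish locale actually changes the
result depends on the C library and the encoding, so no outcome is claimed
for it here.

```c
#include <locale.h>
#include <strings.h>

/* Compare A and B ignoring case under LOCALE_NAME.
   Returns 1 if equal, 0 if different, -1 if the locale is unavailable.
   The "C" locale is restored before returning.  */
static int equal_ignoring_case (const char *locale_name,
                                const char *a, const char *b)
{
  if (setlocale (LC_ALL, locale_name) == NULL)
    return -1;

  int equal = strcasecmp (a, b) == 0;
  setlocale (LC_ALL, "C");
  return equal;
}
```

For example, equal_ignoring_case ("C", "I", "i") returns 1, while the same
call with "tr_TR.ISO-8859-9" or "tr_TR.UTF-8" may return something else or -1,
depending on the system. A fold done in the host compiler would bake in one
answer regardless of the locale active at run time.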

-- 
Joseph S. Myers
jos...@codesourcery.com

[RFC] Possible folding opportunities for string built-ins

2016-10-12 Thread Martin Liška
Hi.

As you have probably noticed, a simple folding improvement has grown into
multiple patches and multiple iterations. Apart from that, I also noticed
that we do not do the best for a couple of cases, and I would like feedback
on whether it is worth improving them.

$ cat /tmp/string-folding-missing.c 
const char global_1[4] = {'a', 'b', 'c', 'd' };
const char global_2[6] = "abcdefghijk";

int main()
{
  const char local1[] = "asdfasdfasdf";

  /* Case 1 */
  __builtin_memchr (global_1, 'c', 5);

  /* Case 2 */
  __builtin_memchr (global_2, 'c', 5);

  /* Case 3 */
  __builtin_memchr (local1, 'a', 5);

  return 0;
}

Cases:
1) Currently, c_getstr (which calls string_constant) can't handle a
CONSTRUCTOR. A potential solution is to create a STRING_CST on demand;
however, as string_constant is called multiple times, that may be overkill.
2) /tmp/x.c:2:26: warning: initializer-string for array of chars is too long
 const char global_2[6] = "abcdefghijk";
Here I'm not sure whether one can consider global_2 == "abcdef" (w/o a
trailing zero char) or not.
If so, adding a new output argument (string_length) to string_constant could
be a solution.
3) Currently, ctor_for_folding returns error_mark_node for local variables.
I'm wondering whether returning DECL_INITIAL for these would be doable?
Would it cause any issues for LTO?

One last question: can strcasecmp be folded aggressively in a host
compiler? Or are there situations where the result depends on the locale?

Thanks for thoughts.
Martin