Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-08 Thread Bob Deen

On 1/3/18 11:43 AM, Janne Blomqvist wrote:

On Wed, Jan 3, 2018 at 8:34 PM, Bob Deen  wrote:

On 12/29/17 5:31 AM, Janne Blomqvist wrote:


In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.

This is an ABI change, as procedures with character arguments take
hidden arguments with the character length.



Did this change not make it into gcc 7 then?


No, it caused regressions on big endian targets, and it was so late in
the gcc 7 cycle that there was no time to fix them (at the time I
didn't have access to a big endian target to test on, and getting said
access took time), so I had to revert it. Those regressions have now
been fixed, and the ABI has been broken anyway due to other changes,
so I'm trying again for gcc 8.


Okay.  If for some reason it doesn't make 8, it would be nice to know. 
I lurk here and try to pay attention but obviously I missed that it was 
pulled from 7.  (which is why I'd prefer something specific to test 
rather than a version number, but it is what it is).


Thanks...

-Bob







  I am one of those who still
use these hidden arguments for Fortran <-> C interfaces.  Based on
discussions a year ago, I added this to my code:

#if defined(__GNUC__) && (__GNUC__ > 6)
#include 
#define FORSTR_STDARG_TYPE size_t
#else
#define FORSTR_STDARG_TYPE int
#endif

I infer from this thread that I should change this to __GNUC__ > 7 now. Is
this still the correct/best way to determine the hidden argument size?


Yes, I would say so.


(note that we're still using 4.x... ugh, don't ask... so the >6 check hasn't
actually been used yet, I just want to future-proof things as much as
possible without having to rewrite the entire Fortran <-> C interface.)

Thanks...

-Bob

Bob Deen  @  NASA-JPL Multmission Image Processing Lab
bob.d...@jpl.nasa.gov









Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Janne Blomqvist
On Thu, Jan 4, 2018 at 4:21 AM, Jerry DeLisle  wrote:
> On 01/03/2018 03:37 AM, Janne Blomqvist wrote:
>> On Sat, Dec 30, 2017 at 10:58 PM, Jerry DeLisle  
>> wrote:
>>> On 12/30/2017 12:35 PM, Janne Blomqvist wrote:
 On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig  
 wrote:
>>> ---snip---

 I can provide that stuff as a separate patch, or merge it into the
 original megapatch and resubmit that, whichever way you prefer.
>>>
>>> I would prefer we split into two patches. This will make review of the 
>>> library
>>> I/O changes easier. The int len is used in a lot of places also where it 
>>> really
>>> happens to also be the kind (which is a length in our implementation).
>>
>> Hi,
>>
>> attached is a patch that makes the two attached testcases work. It
>> applies on top of the charlen->size_t patch. In the formatted I/O
>> stuff, I have mostly used ptrdiff_t to avoid having to deal with
>> signed/unsigned issues, as the previous code was using int.
>>
>>
>>
>
> Well I have started reviewing.  One concern I have is that in many places you
> are changing formatted field widths, like w in read_l, to ptrdiff_t.  I don't
> think this is necessary. If someone is printing a value that has a field width
> that big, it will never finish.
>
> I did not check the definitions of format data structures that use these 
> values
> all over.  So I think in the least, the patch goes too far.

I don't expect to support format widths larger than INT_MAX.  The
reason why ptrdiff_t is used is that read_block_form_* and many
similar functions take a pointer to a ptrdiff_t (to handle the one
case where a scalar variable can be wider than INT_MAX, namely a
character), and we can't pass a pointer to an incompatible type.

Now, you could argue that the internal API should be refactored so
that we pass integers by value rather than as pointers, and then only
the character case would need to care to use a larger type and the
rest could keep using plain int. And I would agree that would be good,
but I think such a refactor would be stage 1 material, not stage 3.

> Another issue I have is that the readability of the code just sucks to me. The
> type ptrdiff_t is so not intuitive in a very many places.  If this is really
> necessary lets hide it behind  a macro or something so I don't have to cringe
> every time I look at this code. (such as gfc_int)
>
> Sometimes we can get too pedantic about things like this. A lot of these
> variables have nothing to do with pointers, though they may be used as 
> modifiers
> to offsets in large memory spaces.

Well, ptrdiff_t is one of the two "pointer-sized integer" typedefs
introduced in C89, the other being size_t. So I would argue that if
one is doing stuff like pointer arithmetic, or calculating array
indices etc., and the variable might for some reason be negative,
ptrdiff_t is the natural type to use. I do agree that ptrdiff_t is a
bit of a mouthful, but I don't have any really good alternatives
either; I'm not convinced that a libgfortran-specific typedef would
make things clearer.

Or we can blame Microsoft which made win64 LLP64 and not LP64 like the
rest of the world, and thus spoiled the use of long for portable code.
:)

FWIW, we have "index_type" which is a typedef for ptrdiff_t, but
that's used mostly in code that deals with array descriptors, so I
thought it would be cleaner to not use it here.


-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Jerry DeLisle
On 01/03/2018 03:37 AM, Janne Blomqvist wrote:
> On Sat, Dec 30, 2017 at 10:58 PM, Jerry DeLisle  wrote:
>> On 12/30/2017 12:35 PM, Janne Blomqvist wrote:
>>> On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig  
>>> wrote:
>> ---snip---
>>>
>>> I can provide that stuff as a separate patch, or merge it into the
>>> original megapatch and resubmit that, whichever way you prefer.
>>
>> I would prefer we split into two patches. This will make review of the 
>> library
>> I/O changes easier. The int len is used in a lot of places also where it 
>> really
>> happens to also be the kind (which is a length in our implementation).
> 
> Hi,
> 
> attached is a patch that makes the two attached testcases work. It
> applies on top of the charlen->size_t patch. In the formatted I/O
> stuff, I have mostly used ptrdiff_t to avoid having to deal with
> signed/unsigned issues, as the previous code was using int.
> 
> 
> 

Well I have started reviewing.  One concern I have is that in many places you
are changing formatted field widths, like w in read_l, to ptrdiff_t.  I don't
think this is necessary. If someone is printing a value that has a field width
that big, it will never finish.

I did not check the definitions of format data structures that use these values
all over.  So I think in the least, the patch goes too far.

Another issue I have is that the readability of the code just sucks to me. The
type ptrdiff_t is so not intuitive in a very many places.  If this is really
necessary lets hide it behind  a macro or something so I don't have to cringe
every time I look at this code. (such as gfc_int)

Sometimes we can get too pedantic about things like this. A lot of these
variables have nothing to do with pointers, though they may be used as modifiers
to offsets in large memory spaces.

I need a day or two to go through all of this. Sorry if I am just a downer 
about it.

Regards,

Jerry


Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Janne Blomqvist
On Wed, Jan 3, 2018 at 8:34 PM, Bob Deen  wrote:
> On 12/29/17 5:31 AM, Janne Blomqvist wrote:
>>
>> In order to handle large character lengths on (L)LP64 targets, switch
>> the GFortran character length from an int to a size_t.
>>
>> This is an ABI change, as procedures with character arguments take
>> hidden arguments with the character length.
>
>
> Did this change not make it into gcc 7 then?

No, it caused regressions on big endian targets, and it was so late in
the gcc 7 cycle that there was no time to fix them (at the time I
didn't have access to a big endian target to test on, and getting said
access took time), so I had to revert it. Those regressions have now
been fixed, and the ABI has been broken anyway due to other changes,
so I'm trying again for gcc 8.

>  I am one of those who still
> use these hidden arguments for Fortran <-> C interfaces.  Based on
> discussions a year ago, I added this to my code:
>
> #if defined(__GNUC__) && (__GNUC__ > 6)
> #include 
> #define FORSTR_STDARG_TYPE size_t
> #else
> #define FORSTR_STDARG_TYPE int
> #endif
>
> I infer from this thread that I should change this to __GNUC__ > 7 now. Is
> this still the correct/best way to determine the hidden argument size?

Yes, I would say so.

> (note that we're still using 4.x... ugh, don't ask... so the >6 check hasn't
> actually been used yet, I just want to future-proof things as much as
> possible without having to rewrite the entire Fortran <-> C interface.)
>
> Thanks...
>
> -Bob
>
> Bob Deen  @  NASA-JPL Multmission Image Processing Lab
> bob.d...@jpl.nasa.gov
>



-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Bob Deen

On 12/29/17 5:31 AM, Janne Blomqvist wrote:

In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.

This is an ABI change, as procedures with character arguments take
hidden arguments with the character length.


Did this change not make it into gcc 7 then?  I am one of those who 
still use these hidden arguments for Fortran <-> C interfaces.  Based on 
discussions a year ago, I added this to my code:


#if defined(__GNUC__) && (__GNUC__ > 6)
#include 
#define FORSTR_STDARG_TYPE size_t
#else
#define FORSTR_STDARG_TYPE int
#endif

I infer from this thread that I should change this to __GNUC__ > 7 now. 
Is this still the correct/best way to determine the hidden argument size?


(note that we're still using 4.x... ugh, don't ask... so the >6 check 
hasn't actually been used yet, I just want to future-proof things as 
much as possible without having to rewrite the entire Fortran <-> C 
interface.)


Thanks...

-Bob

Bob Deen  @  NASA-JPL Multmission Image Processing Lab
bob.d...@jpl.nasa.gov



Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Janne Blomqvist
On Wed, Jan 3, 2018 at 2:10 PM, Thomas Koenig  wrote:
> Hi Janne,
>
>> attached is a patch that makes the two attached testcases work. It
>> applies on top of the charlen->size_t patch. In the formatted I/O
>> stuff, I have mostly used ptrdiff_t to avoid having to deal with
>> signed/unsigned issues, as the previous code was using int.
>
>
> Did you regression-test?

Ah yes, I forgot to mention that. Yes, I did, though only on x86_64-linux-gnu

> If yes, I'd say this patch is OK (all the changes look obvious
> enough).
>
> With this, your character length patch is also OK. We can then
> open individual PRs for the other issues.
>
> However, I'd ask you to wait for a day or so with committing
> so that other people also have a chance for a (final) look
> at this.

Thanks! Dominique mentioned on IRC about some incoming comments, so I
guess it makes sense to wait a few days.



-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Thomas Koenig

Hi Janne,


attached is a patch that makes the two attached testcases work. It
applies on top of the charlen->size_t patch. In the formatted I/O
stuff, I have mostly used ptrdiff_t to avoid having to deal with
signed/unsigned issues, as the previous code was using int.


Did you regression-test?

If yes, I'd say this patch is OK (all the changes look obvious
enough).

With this, your character length patch is also OK. We can then
open individual PRs for the other issues.

However, I'd ask you to wait for a day or so with committing
so that other people also have a chance for a (final) look
at this.

Regards

Thomas


Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Janne Blomqvist
On Sat, Dec 30, 2017 at 10:58 PM, Jerry DeLisle  wrote:
> On 12/30/2017 12:35 PM, Janne Blomqvist wrote:
>> On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig  wrote:
> ---snip---
>>
>> I can provide that stuff as a separate patch, or merge it into the
>> original megapatch and resubmit that, whichever way you prefer.
>
> I would prefer we split into two patches. This will make review of the library
> I/O changes easier. The int len is used in a lot of places also where it 
> really
> happens to also be the kind (which is a length in our implementation).

Hi,

attached is a patch that makes the two attached testcases work. It
applies on top of the charlen->size_t patch. In the formatted I/O
stuff, I have mostly used ptrdiff_t to avoid having to deal with
signed/unsigned issues, as the previous code was using int.



-- 
Janne Blomqvist
From 72799c5587ee1e830bb424278286cff309d0c4a7 Mon Sep 17 00:00:00 2001
From: Janne Blomqvist 
Date: Tue, 2 Jan 2018 14:17:57 +0200
Subject: [PATCH] Use ptrdiff_t for formatted I/O sizes

---
 libgfortran/io/fbuf.c  | 44 ++---
 libgfortran/io/fbuf.h  | 16 +--
 libgfortran/io/io.h| 16 +--
 libgfortran/io/list_read.c | 15 +-
 libgfortran/io/read.c  | 70 +++---
 libgfortran/io/transfer.c  | 36 
 libgfortran/io/unix.c  | 20 ++---
 libgfortran/io/unix.h  | 12 
 libgfortran/io/write.c | 21 +++---
 9 files changed, 126 insertions(+), 124 deletions(-)

diff --git a/libgfortran/io/fbuf.c b/libgfortran/io/fbuf.c
index 944469d..d38d003 100644
--- a/libgfortran/io/fbuf.c
+++ b/libgfortran/io/fbuf.c
@@ -33,7 +33,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 
 
 void
-fbuf_init (gfc_unit *u, int len)
+fbuf_init (gfc_unit *u, ptrdiff_t len)
 {
   if (len == 0)
 len = 512;			/* Default size.  */
@@ -64,9 +64,9 @@ fbuf_debug (gfc_unit *u, const char *format, ...)
   va_start(args, format);
   vfprintf(stderr, format, args);
   va_end(args);
-  fprintf (stderr, "fbuf_debug pos: %d, act: %d, buf: ''", 
-   u->fbuf->pos, u->fbuf->act);
-  for (int ii = 0; ii < u->fbuf->act; ii++)
+  fprintf (stderr, "fbuf_debug pos: %ld, act: %ld, buf: ''",
+   (long) u->fbuf->pos, (long) u->fbuf->act);
+  for (ptrdiff_t ii = 0; ii < u->fbuf->act; ii++)
 {
   putc (u->fbuf->buf[ii], stderr);
 }
@@ -84,10 +84,10 @@ fbuf_debug (gfc_unit *u __attribute__ ((unused)),
underlying device.  Returns how much the physical position was
modified.  */
 
-int
+ptrdiff_t
 fbuf_reset (gfc_unit *u)
 {
-  int seekval = 0;
+  ptrdiff_t seekval = 0;
 
   if (!u->fbuf)
 return 0;
@@ -99,7 +99,7 @@ fbuf_reset (gfc_unit *u)
   if (u->mode == READING && u->fbuf->act > u->fbuf->pos)
 {
   seekval = - (u->fbuf->act - u->fbuf->pos);
-  fbuf_debug (u, "fbuf_reset seekval %d, ", seekval);
+  fbuf_debug (u, "fbuf_reset seekval %ld, ", (long) seekval);
 }
   u->fbuf->act = u->fbuf->pos = 0;
   return seekval;
@@ -111,11 +111,11 @@ fbuf_reset (gfc_unit *u)
reallocating if necessary.  */
 
 char *
-fbuf_alloc (gfc_unit *u, int len)
+fbuf_alloc (gfc_unit *u, ptrdiff_t len)
 {
-  int newlen;
+  ptrdiff_t newlen;
   char *dest;
-  fbuf_debug (u, "fbuf_alloc len %d, ", len);
+  fbuf_debug (u, "fbuf_alloc len %ld, ", (long) len);
   if (u->fbuf->pos + len > u->fbuf->len)
 {
   /* Round up to nearest multiple of the current buffer length.  */
@@ -138,7 +138,7 @@ fbuf_alloc (gfc_unit *u, int len)
 int
 fbuf_flush (gfc_unit *u, unit_mode mode)
 {
-  int nwritten;
+  ptrdiff_t nwritten;
 
   if (!u->fbuf)
 return 0;
@@ -177,7 +177,7 @@ fbuf_flush (gfc_unit *u, unit_mode mode)
 int
 fbuf_flush_list (gfc_unit *u, unit_mode mode)
 {
-  int nwritten;
+  ptrdiff_t nwritten;
 
   if (!u->fbuf)
 return 0;
@@ -206,8 +206,8 @@ fbuf_flush_list (gfc_unit *u, unit_mode mode)
 }
 
 
-int
-fbuf_seek (gfc_unit *u, int off, int whence)
+ptrdiff_t
+fbuf_seek (gfc_unit *u, ptrdiff_t off, int whence)
 {
   if (!u->fbuf)
 return -1;
@@ -226,7 +226,7 @@ fbuf_seek (gfc_unit *u, int off, int whence)
   return -1;
 }
 
-  fbuf_debug (u, "fbuf_seek, off %d ", off);
+  fbuf_debug (u, "fbuf_seek, off %ld ", (long) off);
   /* The start of the buffer is always equal to the left tab
  limit. Moving to the left past the buffer is illegal in C and
  would also imply moving past the left tab limit, which is never
@@ -248,21 +248,21 @@ fbuf_seek (gfc_unit *u, int off, int whence)
of bytes actually processed. */
 
 char *
-fbuf_read (gfc_unit *u, int *len)
+fbuf_read (gfc_unit *u, ptrdiff_t *len)
 {
   char *ptr;
-  int oldact, oldpos;
-  int readlen = 0;
+  ptrdiff_t oldact, oldpos;
+  ptrdiff_t readlen = 0;
 
-  fbuf_debug (u, "fbuf_read, len %d: ", *len);
+  fbuf_debug (u, "fbuf_read, len %ld: ", (long) *len);
   oldact = u->fbuf->act;
   oldpos = u->fbuf->pos;
   ptr = fbuf_alloc (u, *l

Re: [PATCH] PR 78534 Change character length from int to size_t

2017-12-31 Thread Thomas Koenig

Hi Janne,


FWIW, by changing your example to use unformatted I/O, it works
correctly.


Not for me (again, on gcc110):

program main
  character(len=2_8**33), parameter :: a = ""
  write (10) a
end program main

with strace results in

open("fort.10", O_RDWR|O_CREAT|O_CLOEXEC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0664, st_size=8, ...}) = 0
write(3, "\0\0\0\0\0\0\0\0", 8) = 8
ftruncate(3, 8) = 0
close(3)= 0

Regards

Thomas


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-12-30 Thread Jerry DeLisle
On 12/30/2017 12:35 PM, Janne Blomqvist wrote:
> On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig  wrote:
---snip---
> 
> I can provide that stuff as a separate patch, or merge it into the
> original megapatch and resubmit that, whichever way you prefer.

I would prefer we split into two patches. This will make review of the library
I/O changes easier. The int len is used in a lot of places also where it really
happens to also be the kind (which is a length in our implementation).

Janne I trust your judgment about where it makes sense to change to size_t.

On Thomas earlier comment about adding a pointer to void to hold space for
async. I am OK with this too, though I am not convinced we have really defined
our async objective here.  More later on that topic.

Regards,

Jerry



Re: [PATCH] PR 78534 Change character length from int to size_t

2017-12-30 Thread Janne Blomqvist
On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig  wrote:
> Hi Janne,
>
>> To be honest, I haven't really done much testing with big strings, so
>> far my focus has been on getting the existing testsuite to pass and
>> getting the ABI right.
>
>
> I agree that some of the test cases can be fixed later. However, I
> would really prefer if the I/O worked, because that is very basic
> functionality, but also because this is (likely to be) an ABI issue.
> If this bug remains unfixed for any reason at all, then we are left with
> no choice but to break the ABI when we fix that bug. I would like to
> avoid that, if possible.

Fair enough. I took a look at the I/O example you provided, and at
least that particular case is not due to an ABI issue, but rather that
the formatted I/O stuff inside libgfortran extensively uses int for
lengths. I managed to hack around it quickly to make your testcase
work, but a proper fix, while straightforward, implies fixing up the
types a bit more. But still only in the internals, the external ABI
visible interface is ok.

I can provide that stuff as a separate patch, or merge it into the
original megapatch and resubmit that, whichever way you prefer.

FWIW, by changing your example to use unformatted I/O, it works
correctly. And as unformatted uses the same library entry points for
transferring the data, this is thus further evidence that the problem
is in the internals of the formatted I/O rather than in the ABI.



-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-12-30 Thread Thomas Koenig

Hi Janne,


To be honest, I haven't really done much testing with big strings, so
far my focus has been on getting the existing testsuite to pass and
getting the ABI right.


I agree that some of the test cases can be fixed later. However, I
would really prefer if the I/O worked, because that is very basic
functionality, but also because this is (likely to be) an ABI issue.
If this bug remains unfixed for any reason at all, then we are left with
no choice but to break the ABI when we fix that bug. I would like to
avoid that, if possible.

By the way, we also should forsee a few more ABI-breaking things
while we're at it. We should

- Put a "reserved" field in the array descriptor for things like
  "did this come from an ALLOCATE statement or not", there is a PR
  for this
- Put a pointer to void into the I/O structures, which we are certain
  to need for async I/O
- Increase the maximum number of array dimensions to 15, as per f2008
- Insert a "BACK" argument in minloc, maxloc, minval, maxval, even
  if we do not use it at the moment


As the library ABI has been broken for GCC 8
already by other changes, I'd like to piggyback this ABI change in for
the GCC 8 release as well. As the patch is already pretty big as is,
I'd prefer that other fixes to enable big strings would be done as
separate patches rather than trying to make everything perfect on the
first try.


I tend to concur for the other bugs, but not for the I/O issue.

Regards

Thomas


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-12-30 Thread Janne Blomqvist
On Sat, Dec 30, 2017 at 4:59 PM, Thomas Koenig  wrote:
> That's all I could find for the moment. I will continue looking.
> Thanks for tackling this!

Thanks for testing!

To be honest, I haven't really done much testing with big strings, so
far my focus has been on getting the existing testsuite to pass and
getting the ABI right. As the library ABI has been broken for GCC 8
already by other changes, I'd like to piggyback this ABI change in for
the GCC 8 release as well. As the patch is already pretty big as is,
I'd prefer that other fixes to enable big strings would be done as
separate patches rather than trying to make everything perfect on the
first try.

Slightly related to your testcases, I think it would make sense to
create some separate "GFortran huge" testsuite, where we could collect
testcases that need lots of memory, or cpu time, or disk space and can
thus not be part of the default testsuite.

-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-12-30 Thread Thomas Koenig

Hi Janne,


In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.


Just running some tests on gcc110 (the big-endian PPC machine from
the compile farm).

Something does not seem to work with I/O with long strings.

With

program main
  character(len=2_8**33), parameter :: a = ""
  character(len=2_8**33) :: b
  print '(A),a
  b = ""
  print '(A)',b
end program main

I get

$ strace ./a.out > /dev/null
...
write(1, "\n", 1)   = 1
write(1, "\n", 1)   = 1

so I suspect there still is a 32-bit quantity for string lengths
lurking somewhere in the I/O library.

The following program causes a segfault on compilation:

program main
  integer(8), parameter :: n=2_8**32
  character(len=*), parameter :: a1 = repeat('x',n)
end program main

The program

program main
  integer(8), parameter :: n=2_8**32
  character(len=n), parameter :: a1 = repeat('x',n), a2 = repeat('y',n)
  character(len=*), parameter :: a3 = a1 // a2
end program main

is rejected with

tkoenig@gcc1-power7 String]$ gfortran  concat.f90
concat.f90:4:37:

   character(len=*), parameter :: a3 = a1 // a2
 1
Error: Function ‘a1’ in initialization expression at (1) must be an 
intrinsic function


That's all I could find for the moment. I will continue looking.
Thanks for tackling this!

Regards

Thomas


[PATCH] PR 78534 Change character length from int to size_t

2017-12-29 Thread Janne Blomqvist
In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.

This is an ABI change, as procedures with character arguments take
hidden arguments with the character length.

I also changed the _size member in vtables from int to size_t, as
there were some cases where character lengths and sizes were
apparently mixed up and caused regressions otherwise. Although I
haven't tested, this might enable very large derived types as well.

Also, as there are some places in the frontend were negative character
lengths are used as special flag values, in the frontend the character
length is handled as a signed variable of the same size as a size_t,
although in the runtime library it really is size_t.

I haven't changed the character length variables for the co-array
intrinsics, as this is something that may need to be synchronized with
OpenCoarrays.

This is v5 of the patch. v4 was applied but caused breakage on big
endian targets. These have been fixed and tested, thanks to access to
the GCC compile farm.

Overview of v4 of the patch: v3 was applied but had to reverted due to
breaking bootstrap. The fix is in resolve.c:resolve_charlen, where
it's necessary to check that an expression is constant before using
mpz_sgn.

Overview of v3 of the patch: All the issues pointed out by FX's review
of v2 have been fixed. In particular, there are now new functions
gfc_mpz_get_hwi and gfc_mpz_set_hwi, similar to the GMP functions
mpz_get_si and mpz_set_si, except that they get/set a HOST_WIDE_INT
instead of a long value. Similarly, gfc_get_int_expr now takes a
HOST_WIDE_INT instead of a long, gfc_extract_long is replaced by
gfc_extract_hwi. Also, the preliminary work to handle
gfc_charlen_type_node being unsigned has been removed.

Regtested on x86_64-pc-linux-gnu, i686-pc-linux-gnu and
powerpc64-unknown-linux-gnu. Also regtested all three targets by
modifying gfortran-dg.exp to also test with "-g -flto", no new
failures observed.

frontend:

2017-12-29  Janne Blomqvist  

PR fortran/78534
PR fortran/66310
* array.c (got_charlen): Use gfc_charlen_int_kind.
* class.c (gfc_find_derived_vtab): Use gfc_size_kind instead of
hardcoded kind.
(find_intrinsic_vtab): Likewise.
* decl.c (match_char_length): Use gfc_charlen_int_kind.
(add_init_expr_to_sym): Use gfc_charlen_t and gfc_charlen_int_kind.
(gfc_match_implicit): Use gfc_charlen_int_kind.
* dump-parse-tree.c (show_char_const): Use gfc_charlen_t and size_t.
(show_expr): Use HOST_WIDE_INT_PRINT_DEC.
* expr.c (gfc_get_character_expr): Length parameter of type
gfc_charlen_t.
(gfc_get_int_expr): Value argument of type HOST_WIDE_INT.
(gfc_extract_hwi): New function.
(simplify_const_ref): Make string_len of type gfc_charlen_t.
(gfc_simplify_expr): Use HOST_WIDE_INT for substring refs.
* frontend-passes.c (optimize_trim): Use gfc_charlen_int_kind.
* gfortran.h (gfc_mpz_get_hwi): New prototype.
(gfc_mpz_set_hwi): Likewise.
(gfc_charlen_t): New typedef.
(gfc_expr): Use gfc_charlen_t for character lengths.
(gfc_size_kind): New extern variable.
(gfc_extract_hwi): New prototype.
(gfc_get_character_expr): Use gfc_charlen_t for character length.
(gfc_get_int_expr): Use HOST_WIDE_INT type for value argument.
* gfortran.texi: Update description of hidden string length argument.
* iresolve.c (check_charlen_present): Use gfc_charlen_int_kind.
(gfc_resolve_char_achar): Likewise.
(gfc_resolve_repeat): Pass string length directly without
temporary, use gfc_charlen_int_kind.
(gfc_resolve_transfer): Use gfc_charlen_int_kind.
* match.c (select_intrinsic_set_tmp): Use HOST_WIDE_INT for charlen.
* misc.c (gfc_mpz_get_hwi): New function.
(gfc_mpz_set_hwi): New function.
* module.c (atom_int): Change type from int to HOST_WIDE_INT.
(parse_integer): Don't complain about large integers.
(write_atom): Use HOST_WIDE_INT for integers.
(mio_integer): Handle integer type mismatch.
(mio_hwi): New function.
(mio_intrinsic_op): Use HOST_WIDE_INT.
(mio_array_ref): Likewise.
(mio_expr): Likewise.
* primary.c (match_substring): Use gfc_charlen_int_kind.
* resolve.c (resolve_substring_charlen): Use gfc_charlen_int_kind.
(resolve_character_operator): Likewise.
(resolve_assoc_var): Likewise.
(resolve_select_type): Use HOST_WIDE_INT for charlen, use snprintf.
(resolve_charlen): Use mpz_sgn to determine sign.
* simplify.c (gfc_simplify_repeat): Use HOST_WIDE_INT/gfc_charlen_t
instead of long.
* symbol.c (generate_isocbinding_symbol): Use gfc_charlen_int_kind.
* target-memory.c (size_character): Length argument of type
gfc_charlen_t.

Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-14 Thread Jerry DeLisle
On 01/14/2017 12:46 AM, Janne Blomqvist wrote:
> On Sat, Jan 14, 2017 at 1:13 AM, Jerry DeLisle  wrote:
>> On 01/13/2017 11:46 AM, David Edelsohn wrote:
>>> On Fri, Jan 13, 2017 at 12:09 PM, Janne Blomqvist
>>>  wrote:
 On Thu, Jan 12, 2017 at 10:46 AM, FX  wrote:
>> I was finally able to get a 32-bit i686 compiler going (my attempts to
>> do this on a x86_64-pc-linux-gnu host failed, in the end I resorted to
>> running 32-bit builds/tests on a i686 container). At least on i686,
>> the patch below on top of the big charlen->size_t patch fixes the
>> failures
>
> Patch approved. The old code used gfc_extract_int, which bails out if a 
> non-constant expression is passed, so this is the right thing to do.
>
> FX

 Committed as r28.

 David, can you verify that it works alright on ppc?
>>>
>>> Unfortunately, no.  This patch broke bootstrap again with the same
>>> failure as earlier.
>>>
>>> - David
>>>
>>
>> Janne, can you access gcc112 on the compile farm? If not, I can, maybe I can
>> test the patch.
>>
>> Jerry
> 
> No, I don't have such access. I'd appreciate it very much if you'd
> have a go at it.  Attached is the r28 patch combined with the
> patch to fix the sanitizer bugs found by Dominique.
> 

I will try to do this tonight

Jerry


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-14 Thread Janne Blomqvist
On Sat, Jan 14, 2017 at 10:46 AM, Janne Blomqvist
 wrote:
> On Sat, Jan 14, 2017 at 1:13 AM, Jerry DeLisle  wrote:
>> On 01/13/2017 11:46 AM, David Edelsohn wrote:
>>> On Fri, Jan 13, 2017 at 12:09 PM, Janne Blomqvist
>>>  wrote:
 On Thu, Jan 12, 2017 at 10:46 AM, FX  wrote:
>> I was finally able to get a 32-bit i686 compiler going (my attempts to
>> do this on a x86_64-pc-linux-gnu host failed, in the end I resorted to
>> running 32-bit builds/tests on a i686 container). At least on i686,
>> the patch below on top of the big charlen->size_t patch fixes the
>> failures
>
> Patch approved. The old code used gfc_extract_int, which bails out if a 
> non-constant expression is passed, so this is the right thing to do.
>
> FX

 Committed as r28.

 David, can you verify that it works alright on ppc?
>>>
>>> Unfortunately, no.  This patch broke bootstrap again with the same
>>> failure as earlier.
>>>
>>> - David
>>>
>>
>> Janne, can you access gcc112 on the compile farm? If not, I can, maybe I can
>> test the patch.
>>
>> Jerry
>
> No, I don't have such access. I'd appreciate it very much if you'd
> have a go at it.  Attached is the r28 patch combined with the
> patch to fix the sanitizer bugs found by Dominique.
>
> --
> Janne Blomqvist

Since David had problems with module generation, I took a look at the
parts of the patch that changes module.c, and found a few places for
improvements. module.c is a collection of all the ugly type punning
tricks one can think of, in particular wrongly punning to a wider type
might cause a different result on a big endian machine, whereas on a
little endian it would apparently just work.. So could you also try
the attached small patch on top of the previous one?

(I also sent Laurent Guerby an email requesting an account on the compile farm).

-- 
Janne Blomqvist
diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index ca53016..90b77ca 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -143,7 +143,7 @@ enum gfc_wsym_state
 typedef struct pointer_info
 {
   BBT_HEADER (pointer_info);
-  int integer;
+  HOST_WIDE_INT integer;
   pointer_t type;
 
   /* The first component of each member of the union is the pointer
@@ -372,7 +372,7 @@ get_pointer (void *gp)
creating the node if not found.  */
 
 static pointer_info *
-get_integer (int integer)
+get_integer (HOST_WIDE_INT integer)
 {
   pointer_info *p, t;
   int c;
@@ -472,7 +472,7 @@ associate_integer_pointer (pointer_info *p, void *gp)
sometime later.  Returns the pointer_info structure.  */
 
 static pointer_info *
-add_fixup (int integer, void *gp)
+add_fixup (HOST_WIDE_INT integer, void *gp)
 {
   pointer_info *p;
   fixup_t *f;
@@ -2729,7 +2729,8 @@ mio_pointer_ref (void *gp)
   if (iomode == IO_OUTPUT)
 {
   p = get_pointer (*((char **) gp));
-  write_atom (ATOM_INTEGER, &p->integer);
+  HOST_WIDE_INT hwi = p->integer;
+  write_atom (ATOM_INTEGER, &hwi);
 }
   else
 {
@@ -2766,18 +2767,18 @@ static void
 mio_component (gfc_component *c, int vtype)
 {
   pointer_info *p;
-  int n;
 
   mio_lparen ();
 
   if (iomode == IO_OUTPUT)
 {
   p = get_pointer (c);
-  mio_integer (&p->integer);
+  mio_hwi (&p->integer);
 }
   else
 {
-  mio_integer (&n);
+  HOST_WIDE_INT n;
+  mio_hwi (&n);
   p = get_integer (n);
   associate_integer_pointer (p, c);
 }
@@ -5924,7 +5925,7 @@ write_symtree (gfc_symtree *st)
 
   mio_pool_string (&st->name);
   mio_integer (&st->ambiguous);
-  mio_integer (&p->integer);
+  mio_hwi (&p->integer);
 }
 
 


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-14 Thread Dominique d'Humières

> The following patch fixes these issues for me, does it work for you?

Yes, it does!

Dominique

> Janne Blomqvist



Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-14 Thread Janne Blomqvist
On Sat, Jan 14, 2017 at 1:13 AM, Jerry DeLisle  wrote:
> On 01/13/2017 11:46 AM, David Edelsohn wrote:
>> On Fri, Jan 13, 2017 at 12:09 PM, Janne Blomqvist
>>  wrote:
>>> On Thu, Jan 12, 2017 at 10:46 AM, FX  wrote:
> I was finally able to get a 32-bit i686 compiler going (my attempts to
> do this on a x86_64-pc-linux-gnu host failed, in the end I resorted to
> running 32-bit builds/tests on a i686 container). At least on i686,
> the patch below on top of the big charlen->size_t patch fixes the
> failures

 Patch approved. The old code used gfc_extract_int, which bails out if a 
 non-constant expression is passed, so this is the right thing to do.

 FX
>>>
>>> Committed as r28.
>>>
>>> David, can you verify that it works alright on ppc?
>>
>> Unfortunately, no.  This patch broke bootstrap again with the same
>> failure as earlier.
>>
>> - David
>>
>
> Janne, can you access gcc112 on the compile farm? If not, I can, maybe I can
> test the patch.
>
> Jerry

No, I don't have such access. I'd appreciate it very much if you'd
have a go at it.  Attached is the r28 patch combined with the
patch to fix the sanitizer bugs found by Dominique.

-- 
Janne Blomqvist


0001-PR-78534-Change-character-length-from-int-to-size_t.patch.gz
Description: GNU Zip compressed data


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-14 Thread Janne Blomqvist
On Sun, Jan 8, 2017 at 4:29 PM, Dominique d'Humières  wrote:
>> r244027 reverts r244011. Sorry for the breakage. It seems to affect
>> all i686 as well in addition to power, maybe all 32-bit hosts.
>
> For the record, I see the following failures with an instrumented r244026 (as 
> in pr78672)
>
> FAIL: gfortran.dg/char_length_20.f90   -O*  execution test
> FAIL: gfortran.dg/char_length_21.f90   -O*  execution test
> FAIL: gfortran.dg/repeat_2.f90   -O1  execution test
> …
> FAIL: gfortran.dg/repeat_2.f90   -Os  execution test
> FAIL: gfortran.dg/widechar_6.f90   -O1  execution test
> …
> FAIL: gfortran.dg/widechar_6.f90   -Os  execution test
> FAIL: gfortran.dg/widechar_intrinsics_6.f90   -O*  execution test
>
> The run time failures are all of the kind

Sorry, I missed that I had to use an instrumented build to catch
these, and assumed everything was Ok once I managed to regtest cleanly
on i686.

The following patch fixes these issues for me, does it work for you?

diff --git a/libgfortran/intrinsics/string_intrinsics_inc.c
b/libgfortran/intrinsics/string_intrinsics_inc.c
index 0da5130..74a994b 100644
--- a/libgfortran/intrinsics/string_intrinsics_inc.c
+++ b/libgfortran/intrinsics/string_intrinsics_inc.c
@@ -177,23 +177,25 @@ string_trim (gfc_charlen_type *len, CHARTYPE
**dest, gfc_charlen_type slen,
 gfc_charlen_type
 string_len_trim (gfc_charlen_type len, const CHARTYPE *s)
 {
-  const gfc_charlen_type long_len = (gfc_charlen_type) sizeof (unsigned long);
-  gfc_charlen_type i;
+  if (len <= 0)
+return 0;
+
+  const size_t long_len = sizeof (unsigned long);

-  i = len - 1;
+  size_t i = len - 1;

   /* If we've got the standard (KIND=1) character type, we scan the string in
  long word chunks to speed it up (until a long word is hit that does not
  consist of ' 's).  */
   if (sizeof (CHARTYPE) == 1 && i >= long_len)
 {
-  int starting;
+  size_t starting;
   unsigned long blank_longword;

   /* Handle the first characters until we're aligned on a long word
 boundary.  Actually, s + i + 1 must be properly aligned, because
 s + i will be the last byte of a long word read.  */
-  starting = ((unsigned long)
+  starting = (
 #ifdef __INTPTR_TYPE__
  (__INTPTR_TYPE__)
 #endif





-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-13 Thread Jerry DeLisle
On 01/13/2017 11:46 AM, David Edelsohn wrote:
> On Fri, Jan 13, 2017 at 12:09 PM, Janne Blomqvist
>  wrote:
>> On Thu, Jan 12, 2017 at 10:46 AM, FX  wrote:
 I was finally able to get a 32-bit i686 compiler going (my attempts to
 do this on a x86_64-pc-linux-gnu host failed, in the end I resorted to
 running 32-bit builds/tests on a i686 container). At least on i686,
 the patch below on top of the big charlen->size_t patch fixes the
 failures
>>>
>>> Patch approved. The old code used gfc_extract_int, which bails out if a 
>>> non-constant expression is passed, so this is the right thing to do.
>>>
>>> FX
>>
>> Committed as r28.
>>
>> David, can you verify that it works alright on ppc?
> 
> Unfortunately, no.  This patch broke bootstrap again with the same
> failure as earlier.
> 
> - David
> 

Janne, can you access gcc112 on the compile farm? If not, I can, maybe I can
test the patch.

Jerry


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-13 Thread David Edelsohn
On Fri, Jan 13, 2017 at 12:09 PM, Janne Blomqvist
 wrote:
> On Thu, Jan 12, 2017 at 10:46 AM, FX  wrote:
>>> I was finally able to get a 32-bit i686 compiler going (my attempts to
>>> do this on a x86_64-pc-linux-gnu host failed, in the end I resorted to
>>> running 32-bit builds/tests on a i686 container). At least on i686,
>>> the patch below on top of the big charlen->size_t patch fixes the
>>> failures
>>
>> Patch approved. The old code used gfc_extract_int, which bails out if a 
>> non-constant expression is passed, so this is the right thing to do.
>>
>> FX
>
> Committed as r28.
>
> David, can you verify that it works alright on ppc?

Unfortunately, no.  This patch broke bootstrap again with the same
failure as earlier.

- David


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-13 Thread Janne Blomqvist
On Thu, Jan 12, 2017 at 10:46 AM, FX  wrote:
>> I was finally able to get a 32-bit i686 compiler going (my attempts to
>> do this on a x86_64-pc-linux-gnu host failed, in the end I resorted to
>> running 32-bit builds/tests on a i686 container). At least on i686,
>> the patch below on top of the big charlen->size_t patch fixes the
>> failures
>
> Patch approved. The old code used gfc_extract_int, which bails out if a 
> non-constant expression is passed, so this is the right thing to do.
>
> FX

Committed as r28.

David, can you verify that it works alright on ppc?



-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-12 Thread FX
> I was finally able to get a 32-bit i686 compiler going (my attempts to
> do this on a x86_64-pc-linux-gnu host failed, in the end I resorted to
> running 32-bit builds/tests on a i686 container). At least on i686,
> the patch below on top of the big charlen->size_t patch fixes the
> failures

Patch approved. The old code used gfc_extract_int, which bails out if a 
non-constant expression is passed, so this is the right thing to do.

FX

Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-11 Thread Janne Blomqvist
On Sun, Jan 8, 2017 at 4:29 PM, Dominique d'Humières  wrote:
>> r244027 reverts r244011. Sorry for the breakage. It seems to affect
>> all i686 as well in addition to power, maybe all 32-bit hosts.
>
> For the record, I see the following failures with an instrumented r244026 (as 
> in pr78672)

[snip]

I was finally able to get a 32-bit i686 compiler going (my attempts to
do this on a x86_64-pc-linux-gnu host failed, in the end I resorted to
running 32-bit builds/tests on a i686 container). At least on i686,
the patch below on top of the big charlen->size_t patch fixes the
failures:

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index be63038..82319ed 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -4726,7 +4726,7 @@ gfc_resolve_substring_charlen (gfc_expr *e)
   /* Length = (end - start + 1).  */
   e->ts.u.cl->length = gfc_subtract (end, start);
   e->ts.u.cl->length = gfc_add (e->ts.u.cl->length,
-   gfc_get_int_expr (gfc_default_integer_kind,
+   gfc_get_int_expr (gfc_charlen_int_kind,
  NULL, 1));

   /* F2008, 6.4.1:  Both the starting point and the ending point shall
@@ -11420,9 +11420,10 @@ resolve_charlen (gfc_charlen *cl)

   /* F2008, 4.4.3.2:  If the character length parameter value evaluates to
  a negative value, the length of character entities declared is zero.  */
-  if (cl->length && mpz_sgn (cl->length->value.integer) < 0)
+  if (cl->length && cl->length->expr_type == EXPR_CONSTANT
+  && mpz_sgn (cl->length->value.integer) < 0)
 gfc_replace_expr (cl->length,
- gfc_get_int_expr (gfc_default_integer_kind, NULL, 0));
+ gfc_get_int_expr (gfc_charlen_int_kind, NULL, 0));

   /* Check that the character length is not too large.  */
   k = gfc_validate_kind (BT_INTEGER, gfc_charlen_int_kind, false);


So what happened was that without the EXPR_CONSTANT check, I was
accessing uninitialized memory (for some reason probably due to memory
layout or such, this didn't cause failures on x86_64-pc-linux-gnu).

Also, I found a couple of additional places where gfc_charlen_int_kind
should be used instead of gfc_default_integer_kind which is included
in the patch above, although AFAICT they have nothing to do with the
testcase failures.

Unless there are objections, I'll commit the fixed patch in a few days.

-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-08 Thread Dominique d'Humières
> r244027 reverts r244011. Sorry for the breakage. It seems to affect
> all i686 as well in addition to power, maybe all 32-bit hosts.

For the record, I see the following failures with an instrumented r244026 (as 
in pr78672)

FAIL: gfortran.dg/char_length_20.f90   -O*  execution test
FAIL: gfortran.dg/char_length_21.f90   -O*  execution test
FAIL: gfortran.dg/repeat_2.f90   -O1  execution test
…
FAIL: gfortran.dg/repeat_2.f90   -Os  execution test
FAIL: gfortran.dg/widechar_6.f90   -O1  execution test
…
FAIL: gfortran.dg/widechar_6.f90   -Os  execution test
FAIL: gfortran.dg/widechar_intrinsics_6.f90   -O*  execution test

The run time failures are all of the kind

=
==43614==ERROR: AddressSanitizer: heap-buffer-overflow on address 
0x60200168 at pc 0x00010abfb578 bp 0x7fff55735250 sp 0x7fff55735248
READ of size 8 at 0x60200168 thread T0
#0 0x10abfb577 in _gfortran_string_len_trim 
(/opt/gcc/gcc7gp/lib/libgfortran.4.dylib+0x728577)
#1 0x10a4cae2c in MAIN__ 
(/Users/dominiq/Documents/Fortran/g95bench/win/f90/bug/a.out+0x10e2c)
#2 0x10a4caea6 in main 
(/Users/dominiq/Documents/Fortran/g95bench/win/f90/bug/a.out+0x10ea6)
#3 0x7fffbd674254 in start (/usr/lib/system/libdyld.dylib+0x5254)

0x60200168 is located 8 bytes to the left of 1-byte region 
[0x60200170,0x60200171)
allocated by thread T0 here:
#0 0x10c043319 in wrap_malloc (/opt/gcc/gcc7a/lib/libasan.4.dylib+0x61319)
#1 0x10a4cad11 in MAIN__ 
(/Users/dominiq/Documents/Fortran/g95bench/win/f90/bug/a.out+0x10d11)
#2 0x10a4caea6 in main 
(/Users/dominiq/Documents/Fortran/g95bench/win/f90/bug/a.out+0x10ea6)
#3 0x7fffbd674254 in start (/usr/lib/system/libdyld.dylib+0x5254)

SUMMARY: AddressSanitizer: heap-buffer-overflow 
(/opt/gcc/gcc7gp/lib/libgfortran.4.dylib+0x728577) in _gfortran_string_len_trim
Shadow bytes around the buggy address:
  0x1c03ffd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1c03ffe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1c03fff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1c04: fa fa fd fd fa fa fd fd fa fa 00 07 fa fa 00 06
  0x1c040010: fa fa 03 fa fa fa 00 00 fa fa 00 06 fa fa 06 fa
=>0x1c040020: fa fa 07 fa fa fa 07 fa fa fa fd fa fa[fa]01 fa
  0x1c040030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c040040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c040050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c040060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c040070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:   00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:   fa
  Freed heap region:   fd
  Stack left redzone:  f1
  Stack mid redzone:   f2
  Stack right redzone: f3
  Stack after return:  f5
  Stack use after scope:   f8
  Global redzone:  f9
  Global init order:   f6
  Poisoned by user:f7
  Container overflow:  fc
  Array cookie:ac
  Intra object redzone:bb
  ASan internal:   fe
  Left alloca redzone: ca
  Right alloca redzone:cb
==43614==ABORTING

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:
#0  0x10a4d8558
#1  0x10a4d65f5
#2  0x7fffbd881bb9
Abort

Dominique



Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-03 Thread Janne Blomqvist
On Tue, Jan 3, 2017 at 9:21 PM, FX  wrote:
>> r244027 reverts r244011. Sorry for the breakage. It seems to affect
>> all i686 as well in addition to power, maybe all 32-bit hosts.
>
> The breakage is surprising, as the rejects-valid does not involve character 
> length at all.
> Jane, any chance you might have accidentally committed some unrelated change 
> along?

Yes, I'm surprised, and I don't understand it yet. I'm planning to
fire up a 32-bit VM and build there in order see what the issue is.

-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-03 Thread FX
> r244027 reverts r244011. Sorry for the breakage. It seems to affect
> all i686 as well in addition to power, maybe all 32-bit hosts.

The breakage is surprising, as the rejects-valid does not involve character 
length at all.
Jane, any chance you might have accidentally committed some unrelated change 
along?

FX

Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-03 Thread Janne Blomqvist
On Tue, Jan 3, 2017 at 4:07 PM, David Edelsohn  wrote:
> This patch broke bootstrap.  I now am seeing numerous errors when
> building libgomp.
>
> Please fix or revert immediately.

r244027 reverts r244011. Sorry for the breakage. It seems to affect
all i686 as well in addition to power, maybe all 32-bit hosts.





-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-03 Thread David Edelsohn
This patch broke bootstrap.  I now am seeing numerous errors when
building libgomp.

Please fix or revert immediately.

Thanks, David

omp_lib.f90:184:40:

 logical (4) :: omp_test_lock
1
Error: Symbol 'omp_test_lock' at (1) has already been host associated
omp_lib.f90:216:45:

 integer (4) :: omp_test_nest_lock
 1
Error: Symbol 'omp_test_nest_lock' at (1) has already been host associated
omp_lib.f90:329:61:

 integer (omp_proc_bind_kind) :: omp_get_proc_bind
 1
Error: Symbol 'omp_get_proc_bind' at (1) has already been host associated


and


/nasfarm/edelsohn/src/src/libgomp/openacc.f90:466:6:

   use openacc_internal
  1
Error: 'acc_device_nvidia' of module 'openacc_internal', imported at (1), is als
o the name of the current program unit
/nasfarm/edelsohn/src/src/libgomp/openacc.f90:466:6:

   use openacc_internal
  1
Error: Alternate return specifier in function 'acc_async_test_h' at (1) is not a
llowed
/nasfarm/edelsohn/src/src/libgomp/openacc.f90:466:6:

   use openacc_internal
  1
Error: Alternate return specifier in function 'acc_async_test_


[PATCH] PR 78534 Change character length from int to size_t

2017-01-01 Thread Janne Blomqvist
In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.

This is an ABI change, as procedures with character arguments take
hidden arguments with the character length.

I also changed the _size member in vtables from int to size_t, as
there were some cases where character lengths and sizes were
apparently mixed up and caused regressions otherwise. Although I
haven't tested, this might enable very large derived types as well.

Also, as there are some places in the frontend were negative character
lengths are used as special flag values, in the frontend the character
length is handled as a signed variable of the same size as a size_t,
although in the runtime library it really is size_t.

I haven't changed the character length variables for the co-array
intrinsics, as this is something that may need to be synchronized with
OpenCoarrays.

Another change in this place is that the algorithm for
gfc_trans_string_copy has been rewritten to avoid a
-Wstringop-overflow warning.

This is v3 of the patch. All the issues pointed out by FX's review of
v2 have been fixed. In particular, there are now new functions
gfc_mpz_get_hwi and gfc_mpz_set_hwi, similar to the GMP functions
mpz_get_si and mpz_set_si, except that they get/set a HOST_WIDE_INT
instead of a long value. Similarly, gfc_get_int_expr now takes a
HOST_WIDE_INT instead of a long, gfc_extract_long is replaced by
gfc_extract_hwi. Also, the preliminary work to handle
gfc_charlen_type_node being unsigned has been removed.

Regtested on x86_64-pc-linux-gnu, Ok for trunk?

frontend:

2017-01-01  Janne Blomqvist  

PR fortran/78534
PR fortran/66310
* class.c (gfc_find_derived_vtab): Use gfc_size_kind instead of
hardcoded kind.
(find_intrinsic_vtab): Likewise.
* expr.c (gfc_get_character_expr): Length parameter of type
gfc_charlen_t.
(gfc_get_int_expr): Value argument of type HOST_WIDE_INT.
(gfc_extract_hwi): New function.
(simplify_const_ref): Make string_len of type gfc_charlen_t.
(gfc_simplify_expr): Use HOST_WIDE_INT for substring refs.
* gfortran.h (gfc_mpz_get_hwi): New prototype.
(gfc_mpz_set_hwi): Likewise.
(gfc_charlen_t): New typedef.
(gfc_expr): Use gfc_charlen_t for character lengths.
(gfc_size_kind): New extern variable.
(gfc_extract_hwi): New prototype.
(gfc_get_character_expr): Use gfc_charlen_t for character length.
(gfc_get_int_expr): Use HOST_WIDE_INT type for value argument.
* iresolve.c (gfc_resolve_repeat): Pass string length directly without
temporary, use gfc_charlen_int_kind.
* match.c (select_intrinsic_set_tmp): Use HOST_WIDE_INT for charlen.
* misc.c (gfc_mpz_get_hwi): New function.
(gfc_mpz_set_hwi): New function.
* module.c (atom_int): Change type from int to HOST_WIDE_INT.
(parse_integer): Don't complain about large integers.
(write_atom): Use HOST_WIDE_INT for integers.
(mio_integer): Handle integer type mismatch.
(mio_hwi): New function.
(mio_intrinsic_op): Use HOST_WIDE_INT.
(mio_array_ref): Likewise.
(mio_expr): Likewise.
* resolve.c (resolve_select_type): Use HOST_WIDE_INT for charlen,
use snprintf.
(resolve_charlen): Use mpz_sgn to determine sign.
* simplify.c (gfc_simplify_repeat): Use HOST_WIDE_INT/gfc_charlen_t
instead of long.
* target-memory.c (size_character): Length argument of type
gfc_charlen_t.
(gfc_encode_character): Likewise.
(gfc_interpret_character): Use gfc_charlen_t.
* target-memory.h (gfc_encode_character): Modify prototype.
* trans-array.c (get_array_ctor_var_strlen): Use
gfc_conv_mpz_to_tree_type.
* trans-const.c (gfc_conv_mpz_to_tree_type): New function.
* trans-const.h (gfc_conv_mpz_to_tree_type): New prototype.
* trans-expr.c (gfc_class_len_or_zero_get): Build const of type
gfc_charlen_type_node.
(gfc_conv_intrinsic_to_class): Use gfc_charlen_int_kind instead of
4, fold_convert to correct type.
(gfc_conv_class_to_class): Build const of type size_type_node for
size.
(gfc_copy_class_to_class): Likewise.
(gfc_conv_string_length): Use same type in expression.
(gfc_conv_substring): Likewise, use HOST_WIDE_INT for charlen.
(gfc_conv_string_tmp): Make sure len is of the right type.
(gfc_conv_concat_op): Use same type in expression.
(gfc_conv_procedure_call): Likewise.
(gfc_trans_string_copy): Rewrite to avoid -Wstringop-overflow
warning in generated code.
(alloc_scalar_allocatable_for_subcomponent_assignment):
fold_convert to right type.
(gfc_trans_subcomponent_assign): Likewise.
(trans_class_vptr_len_assignment): Build const of correct type.
 

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-29 Thread Gerald Pfeifer
On Tue, 20 Dec 2016, FX wrote:
> Finally, if we’re making this change, we welcome any feedback on how 
> to make it as easy as possible to handle in user code. Documentation, 
> preprocessor macros, etc.

I believe including this in the (yet to be created) gcc-7/porting_to.html,
would be great.

Historically the porting_to.html documents have mostly covered C and 
C++, since that is the source language of the majority of packages in 
a GNU/Linux distribution that GCC touches.  Adding more focus on
Fortran users as well feels like a good idea, though.

(If you want to go ahead, but prefer the page to be created first,
let me know, and I'll take care.)

Gerald

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-29 Thread Andre Vehreschild
Hi Janne, hi FX,

On Tue, 27 Dec 2016 12:56:19 +0200
Janne Blomqvist  wrote:

> >> I also changed the _size member in vtables from int to size_t, as
> >> there were some cases where character lengths and sizes were
> >> apparently mixed up and caused regressions otherwise. Although I

I can confirm this. Being responsible for adding the _len component for char
arrays in unlimited polymorphic objects. This is the only use case where the
_len component is used to my knowledge. The separation should have been:

- _size: store the size in bytes of a single character
- _len: the number of characters stored in the char array in the unlimited
  polymorphic object.

Unfortunately there were some case, which Janne also experienced, where these go
stray. I at least succeeded to remove the length from the vtab's-name that is
generated for storing in the unlimited polymorphic object. Over time I hope to
get the separation of concerns correctly modeled as told above, but for the
time being we have to stick with _size have the array size sometimes. I think
that is the case when a fixed length char array is stored in the unlimited
polymorphic object.

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-27 Thread FX
> The _size member is the size of the object; for polymorphic objects
> the size isn't known at compile time, so it is to be stored in the
> vtable (in principle, since the vtable tells the exact class of a
> polymorphic object, it should be possible to figure out the size
> without an explicit _size member, but I'll let the OOP experts handle
> that, if they want). There is another vtable member _len which is used
> for character lengths, and that is indeed of type
> gfc_charlen_type_node.

Understood, thanks.


> I think when the dust settles, I might make sense to get rid of
> gfc_charlen_type_node/gfc_charlen_type and just use
> size_type_node/size_t also for string lengths, but at least for now I
> think it's clearer to use separate names.

I agree with keeping them for now.


> So strictly speaking it's not necessary, as long as
> gfc_charlen_type_node is a signed integer.
> 
> OTOH, if/when one wants to make gfc_charlen_type_node unsigned,
> perhaps some more far-reaching changes are needed and that work is not
> necessary anymore. Say, introducing BT_UNSIGNED_INTEGER (??) and
> teaching gfc_typespec to handle unsigned integers? Or maybe it's
> better to put a tree node specifying the type in the typespec and use
> that instead of the bt type + kind to tell what type it is?
> 
> Do you want me to remove that for the time being?

Let’s remove that, at least for now.


>> There are other cases (resolve.c, simplify.c) where you introduce a 
>> dependency on middle-end entities (tree.h, trans-types.h) in what are pure 
>> Fortran front-end stages. This breaks the separation that currently exists, 
>> and which I strongly think we should keep.
> 
> These changes are similar to the above, i.e. a check that uses
> get_type_static_bounds() and works also if gfc_charlen_type_node is
> changed to be an unsigned type.

OK then let’s remove them too.



> - Should I remove the so-far preliminary work to handle
> gfc_charlen_type_node being unsigned?

I think it makes more sense.


> - Should I fix the uses of mpz_{get,set}_{s,u}i?

I think so, otherwise there is little reason to break the ABI and not support 
long strings :)


>> - trans-types.h: why do we now need to include trans.h?
> 
> IIRC this was due to some of the new .c files including trans-types.h
> but not trans.h and failing to compile. AFAIU the convention is that
> headers should include whatever is necessary to use said header. So
> this is some latent bug that has been exposed by my other changes.

If, with the final version of the patch, you can remove it, please do. And if 
you don’t, please remove the trans.h includes from source files that already 
include trans-types.h

FX



Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-27 Thread Janne Blomqvist
On Tue, Dec 27, 2016 at 12:47 PM, Andre Vehreschild  wrote:
> Hi Janne,
>
> sorry for being late in voicing my opinion, but I personally would prefer to
> have this patch in a separately. That way bisecting for performance
> regressions points only to the offending code and not to the change of the
> character length (the latter might add a tiny bit of cost, too).

No worries. I included this in the updated charlen-size_t patch I
posted and which FX reviewed, but I can certainly commit this part
separately.

>
> Regards,
> Andre
>
>
> On Fri, 23 Dec 2016 11:27:10 +0200
> Janne Blomqvist  wrote:
>
>> On Wed, Dec 21, 2016 at 3:14 PM, Andre Vehreschild  wrote:
>> >> Now when I think about this some more, I have a vague recollection
>> >> that a long time ago it used to be something like that.  The problem
>> >> is that MIN_EXPR will of course be
>> >> NON-CONSTANT, so the memcpy call can't be inlined. Hence it was
>> >> changed to two separate __builtin_memmove() calls to have better
>> >> opportunity to inline. So probably a no-go to change it back. :(
>> >
>> > I don't get that. From the former only one of the memmove's could have been
>> > inlined assuming that only CONSTANT sizes are inlined.
>>
>> Yes. Which is better than not being able to inline, assuming the
>> inlined branch is more likely to be taken, no?
>>
>> > The other one had a
>> > NON-CONSTANT as long as pointer following and constant propagation was not
>> > effective together. In our case the "NON-CONSTANT branch" would have been
>> > used, which is resolved by constant propagation (of the size of constant
>> > memory p points to). I assume the warning is triggered, because dead-code
>> > elimination has not removed the else part.
>>
>> Probably yes, in this case. My performance worries were more about the
>> general case. But perhaps they are unfounded, IDK really. Does anybody
>> have a battery of string operation benchmarks that mirrors how real
>> Fortran code does string handling?
>>
>> Anyway, the attached patch accomplishes that. It turns the tree dump I
>> showed two days ago into something like
>>
>>   {
>> integer(kind=8) D.3484;
>> unsigned long D.3485;
>>
>> D.3484 = *_p;
>> D.3485 = (unsigned long) D.3484 + 18446744073709551608;
>> if (D.3484 != 0)
>>   {
>> __builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1
>> sz: 1}, MIN_EXPR <(unsigned long) D.3484, 8>);
>> if ((unsigned long) D.3484 > 8)
>>   {
>> __builtin_memset ((void *) *p + 8, 32, D.3485);
>>   }
>>   }
>>   }
>>
>> and gets rid of the -Wstringop-overflow warning. (It causes a two new
>> testsuite failures (gfortran.dg/dependency_49.f90 and
>> gfortran.dg/transfer_intrinsic_1.f90), but those are just tests that
>> grep for patterns in the .original dump and need to be adjusted).
>>
>> If that is Ok, do you prefer it as part of the charlen-size_t patch,
>> or would you (GFortran maintainers in general) prefer that it's a
>> separate patch?
>>
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de



-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-27 Thread Janne Blomqvist
On Mon, Dec 26, 2016 at 12:32 PM, FX  wrote:
> Hi Janne,
>
> Thanks for the patch, it is hard and tedious work. Here is the formal review. 
> I don’t want to be a pain, but I have several questions about the patch, and 
> given its size and the importance I think we should be double-sure :)

Thanks for the in-depth review, I appreciate it!

>> I also changed the _size member in vtables from int to size_t, as
>> there were some cases where character lengths and sizes were
>> apparently mixed up and caused regressions otherwise. Although I
>> haven't tested, this might enable very large derived types as well.
>
> Regarding that one, why are you making it an explicit size_t and not a 
> charlen type? I know the two will be the same, at least for now, but given 
> that it’s explicitly a character length we should use that variable type. 
> This is a preexisting issue with the front-end and library, where we 
> generally use a mix of types (because they end up being the same anyway, such 
> as C int and GFC_INTEGER_4).

The _size member is the size of the object; for polymorphic objects
the size isn't known at compile time, so it is to be stored in the
vtable (in principle, since the vtable tells the exact class of a
polymorphic object, it should be possible to figure out the size
without an explicit _size member, but I'll let the OOP experts handle
that, if they want). There is another vtable member _len which is used
for character lengths, and that is indeed of type
gfc_charlen_type_node.

I think when the dust settles, I might make sense to get rid of
gfc_charlen_type_node/gfc_charlen_type and just use
size_type_node/size_t also for string lengths, but at least for now I
think it's clearer to use separate names.

> Regarding the introduction of is_charlen in gfc_typespec, I am unclear as to 
> why it is needed. It is used exclusively in arith.c, which is not where we 
> should be checking character lengths I think. It is visible by the fact that 
> we normally shouldn’t need access to middle-end headers (tree.h and 
> trans-types.h) at that level. So, can’t we make the check where we currently 
> do it, i.e. later when we translate the constant string? That sounds more 
> reasonable that introducing a new special-cased entity.

This is, well, a leftover from my attempts to make
gfc_charlen_type_node an alias for size_type_node, an unsigned type.
Since the gfc_typespec doesn't understand unsigned integers, it had to
be extended in some fashion.  You can see that the check in arith.c
checks that the charlen is in the range [0,
TYPE_MAX_VALUE(gfc_charlen_type_node)], which the "normal" check isn't
able to do.

So strictly speaking it's not necessary, as long as
gfc_charlen_type_node is a signed integer.

OTOH, if/when one wants to make gfc_charlen_type_node unsigned,
perhaps some more far-reaching changes are needed and that work is not
necessary anymore. Say, introducing BT_UNSIGNED_INTEGER (??) and
teaching gfc_typespec to handle unsigned integers? Or maybe it's
better to put a tree node specifying the type in the typespec and use
that instead of the bt type + kind to tell what type it is?

Do you want me to remove that for the time being?

> There are other cases (resolve.c, simplify.c) where you introduce a 
> dependency on middle-end entities (tree.h, trans-types.h) in what are pure 
> Fortran front-end stages. This breaks the separation that currently exists, 
> and which I strongly think we should keep.

These changes are similar to the above, i.e. a check that uses
get_type_static_bounds() and works also if gfc_charlen_type_node is
changed to be an unsigned type.

> ** libgfortran **
>
> - in io/write.c, the “for” clauses in in namelist_write() have weird spacing 
> around their semicolons (i.e. space should be after, not before)

Ah, I hadn't noticed that. As one can see, it's a pre-existing issue,
but I might as well fix it when I'm changing that line. Will do.

> - in intrinsics/extends_type_of.c, use gfc_charlen_type instead of size_t 
> vtype->size

As I explained earlier, this is because the size member is the size of
the object rather than the charlen, so I think it should stay a
size_t.

> ** front-end **
>
> - class.c: use gfc_charlen_int_kind instead of gfc_size_kind

Same here.

> - class.c, in find_intrinsic_vtab(): in the call to gfc_get_int_expr, the 
> third argument cannot be cast from (size_t) to (long), as this would fail on 
> LLP64 hosts

Yes, but... the issue is that gfc_get_int_expr uses mpz_set_si, so
can't handle more than long anyway. But hmm, maybe that typecast
should be removed anyway, so that when/if gfc_get_int_expr is fixed
this would be automatically fixed as well.

> - expr.c, regarding gfc_extract_long(): we could definitely extract an host 
> wide int or an mpz value. This new function is called twice: once in 
> resolve_charlen() where we could use the GMP function mpz_sgn() to check if 
> the constant value is negative; the second time in gfc_simplify_repeat, where 

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-27 Thread Andre Vehreschild
Hi Janne,

sorry for being late in voicing my opinion, but I personally would prefer to
have this patch in a separately. That way bisecting for performance
regressions points only to the offending code and not to the change of the
character length (the latter might add a tiny bit of cost, too).

Regards,
Andre


On Fri, 23 Dec 2016 11:27:10 +0200
Janne Blomqvist  wrote:

> On Wed, Dec 21, 2016 at 3:14 PM, Andre Vehreschild  wrote:
> >> Now when I think about this some more, I have a vague recollection
> >> that a long time ago it used to be something like that.  The problem
> >> is that MIN_EXPR will of course be
> >> NON-CONSTANT, so the memcpy call can't be inlined. Hence it was
> >> changed to two separate __builtin_memmove() calls to have better
> >> opportunity to inline. So probably a no-go to change it back. :(  
> >
> > I don't get that. From the former only one of the memmove's could have been
> > inlined assuming that only CONSTANT sizes are inlined.  
> 
> Yes. Which is better than not being able to inline, assuming the
> inlined branch is more likely to be taken, no?
> 
> > The other one had a
> > NON-CONSTANT as long as pointer following and constant propagation was not
> > effective together. In our case the "NON-CONSTANT branch" would have been
> > used, which is resolved by constant propagation (of the size of constant
> > memory p points to). I assume the warning is triggered, because dead-code
> > elimination has not removed the else part.  
> 
> Probably yes, in this case. My performance worries were more about the
> general case. But perhaps they are unfounded, IDK really. Does anybody
> have a battery of string operation benchmarks that mirrors how real
> Fortran code does string handling?
> 
> Anyway, the attached patch accomplishes that. It turns the tree dump I
> showed two days ago into something like
> 
>   {
> integer(kind=8) D.3484;
> unsigned long D.3485;
> 
> D.3484 = *_p;
> D.3485 = (unsigned long) D.3484 + 18446744073709551608;
> if (D.3484 != 0)
>   {
> __builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1
> sz: 1}, MIN_EXPR <(unsigned long) D.3484, 8>);
> if ((unsigned long) D.3484 > 8)
>   {
> __builtin_memset ((void *) *p + 8, 32, D.3485);
>   }
>   }
>   }
> 
> and gets rid of the -Wstringop-overflow warning. (It causes a two new
> testsuite failures (gfortran.dg/dependency_49.f90 and
> gfortran.dg/transfer_intrinsic_1.f90), but those are just tests that
> grep for patterns in the .original dump and need to be adjusted).
> 
> If that is Ok, do you prefer it as part of the charlen-size_t patch,
> or would you (GFortran maintainers in general) prefer that it's a
> separate patch?
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-26 Thread FX
Hi Janne,

Thanks for the patch, it is hard and tedious work. Here is the formal review. I 
don’t want to be a pain, but I have several questions about the patch, and 
given its size and the importance I think we should be double-sure :)


> I also changed the _size member in vtables from int to size_t, as
> there were some cases where character lengths and sizes were
> apparently mixed up and caused regressions otherwise. Although I
> haven't tested, this might enable very large derived types as well.

Regarding that one, why are you making it an explicit size_t and not a charlen 
type? I know the two will be the same, at least for now, but given that it’s 
explicitly a character length we should use that variable type. This is a 
preexisting issue with the front-end and library, where we generally use a mix 
of types (because they end up being the same anyway, such as C int and 
GFC_INTEGER_4).

Regarding the introduction of is_charlen in gfc_typespec, I am unclear as to 
why it is needed. It is used exclusively in arith.c, which is not where we 
should be checking character lengths I think. It is visible by the fact that we 
normally shouldn’t need access to middle-end headers (tree.h and trans-types.h) 
at that level. So, can’t we make the check where we currently do it, i.e. later 
when we translate the constant string? That sounds more reasonable that 
introducing a new special-cased entity.

There are other cases (resolve.c, simplify.c) where you introduce a dependency 
on middle-end entities (tree.h, trans-types.h) in what are pure Fortran 
front-end stages. This breaks the separation that currently exists, and which I 
strongly think we should keep.



** libgfortran **

- in io/write.c, the “for” clauses in in namelist_write() have weird spacing 
around their semicolons (i.e. space should be after, not before)
- in intrinsics/extends_type_of.c, use gfc_charlen_type instead of size_t 
vtype->size

** front-end **

- class.c: use gfc_charlen_int_kind instead of gfc_size_kind
- class.c, in find_intrinsic_vtab(): in the call to gfc_get_int_expr, the third 
argument cannot be cast from (size_t) to (long), as this would fail on LLP64 
hosts
- expr.c, regarding gfc_extract_long(): we could definitely extract an host 
wide int or an mpz value. This new function is called twice: once in 
resolve_charlen() where we could use the GMP function mpz_sgn() to check if the 
constant value is negative; the second time in gfc_simplify_repeat, where we 
should simply bail out (return NULL) if the integer is too big to fit into a 
long (we would bail out a few lines later anyway, see “semi-arbitrary limit”).
- iresolve.c, extra space after NULL in call to gfc_get_int_expr() in 
gfc_resolve_repeat()
- match.c, in select_intrinsic_set_tmp(), charlen should be a gfc_charlen_t and 
mpz_get_si will break for long string sizes
- in resolve.c, like in arith.c, we should not use tree.h and trans-types.h. We 
should do the comparison by looking at integer kinds, not through the 
charlen_type_node
- in resolve.c, in resolve_select_type(), another case of mpz_get_si() that 
will break for long string sizes
- in simplify.c, again, we should not use tree.h and trans-types.h
- trans-decl.c seems like unrelated changes
- trans-types.h: why do we now need to include trans.h?


Thanks again for working on that!

FX

[PATCH] PR 78534 Change character length from int to size_t

2016-12-23 Thread Janne Blomqvist
In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.

This is an ABI change, as procedures with character arguments take
hidden arguments with the character length.

I also changed the _size member in vtables from int to size_t, as
there were some cases where character lengths and sizes were
apparently mixed up and caused regressions otherwise. Although I
haven't tested, this might enable very large derived types as well.

Also, as there are some places in the frontend were negative character
lengths are used as special flag values, in the frontend the character
length is handled as a signed variable of the same size as a size_t,
although in the runtime library it really is size_t.

I haven't changed the character length variables for the co-array
intrinsics, as this is something that may need to be synchronized with
OpenCoarrays.

Another change in this place is that the algorithm for
gfc_trans_string_copy has been rewritten to avoid a
-Wstringop-overflow warning.

Regtested on x86_64-pc-linux-gnu, Ok for trunk?

frontend:

2016-12-23  Janne Blomqvist  

PR fortran/78534
PR fortran/66310
* arith.c (gfc_check_charlen_range): New function.
(gfc_range_check): Use gfc_check_charlen_range.
* class.c (gfc_find_derived_vtab): Use gfc_size_kind instead of
hardcoded kind.
(find_intrinsic_vtab): Likewise.
* expr.c (gfc_get_character_expr): Length parameter of type
gfc_charlen_t.
(gfc_get_int_expr): Value argument of type long.
(gfc_extract_long): New function.
* gfortran.h (gfc_typespec): New member is_charlen.
(gfc_charlen_t): New typedef.
(gfc_expr): Use gfc_charlen_t for character lengths.
(gfc_size_kind): New extern variable.
(gfc_extract_long): New prototype.
(gfc_get_character_expr): Use gfc_charlen_t for character length.
(gfc_get_int_expr): Use long type for value argument.
* iresolve.c (gfc_resolve_repeat): Use gfc_charlen_t,
gfc_charlen_int_kind, set is_charlen.
* match.c (select_intrinsic_set_tmp): Use long for charlen.
* module.c (atom_int): Change type from int to HOST_WIDE_INT.
(parse_integer): Don't complain about large integers.
(write_atom): Use HOST_WIDE_INT for integers.
(mio_integer): Handle integer type mismatch.
(mio_hwi): New function.
(mio_intrinsic_op): Use HOST_WIDE_INT.
(mio_array_ref): Likewise.
(mio_expr): Likewise.
* resolve.c (resolve_substring): Use get_type_static_bounds.
(resolve_select_type): Use long for charlen.
(resolve_charlen): Use long for charlen, get_type_static_bounds.
* simplify.c (gfc_simplify_repeat): Likewise.
* target-memory.c (gfc_interpret_character): Use gfc_charlen_t.
* trans-array.c (get_array_ctor_var_strlen): Use
gfc_conv_mpz_to_tree_type.
* trans-const.c (gfc_conv_mpz_to_tree_type): New function.
* trans-const.h (gfc_conv_mpz_to_tree_type): New prototype.
* trans-decl.c (create_function_arglist): Assert that length is
not NULL_TREE.
* trans-expr.c (gfc_class_len_or_zero_get): Build const of type
gfc_charlen_type_node.
(gfc_conv_intrinsic_to_class): Use gfc_charlen_int_kind instead of
4, fold_convert to correct type.
(gfc_conv_class_to_class): Build const of type size_type_node for
size.
(gfc_copy_class_to_class): Likewise.
(gfc_conv_string_length): Use same type in expression.
(gfc_conv_substring): Likewise, use long for charlen.
(gfc_conv_string_tmp): Make sure len is of the right type.
(gfc_conv_concat_op): Use same type in expression.
(gfc_conv_procedure_call): Likewise.
(gfc_trans_string_copy): Rewrite to avoid -Wstringop-overflow
warning in generated code.
(alloc_scalar_allocatable_for_subcomponent_assignment):
fold_convert to right type.
(gfc_trans_subcomponent_assign): Likewise.
(trans_class_vptr_len_assignment): Build const of correct type.
(gfc_trans_pointer_assignment): Likewise.
(alloc_scalar_allocatable_for_assignment): fold_convert to right
type in expr.
(trans_class_assignment): Build const of correct type.
* trans-intrinsic.c (gfc_conv_associated): Likewise.
(gfc_conv_intrinsic_repeat): Do calculation in sizetype.
* trans-io.c (gfc_build_io_library_fndecls): Use
gfc_charlen_type_node for character lengths.
* trans-stmt.c (gfc_trans_label_assign): Build const of
gfc_charlen_type_node.
(gfc_trans_character_select): Likewise.
(gfc_trans_allocate): Likewise, don't typecast strlen result.
(gfc_trans_deallocate): Don't typecast strlen result.
* trans-types.c (gfc_size_kind): New variable.
(gfc_init_types): 

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-23 Thread Janne Blomqvist
On Wed, Dec 21, 2016 at 3:14 PM, Andre Vehreschild  wrote:
>> Now when I think about this some more, I have a vague recollection
>> that a long time ago it used to be something like that.  The problem
>> is that MIN_EXPR will of course be
>> NON-CONSTANT, so the memcpy call can't be inlined. Hence it was
>> changed to two separate __builtin_memmove() calls to have better
>> opportunity to inline. So probably a no-go to change it back. :(
>
> I don't get that. From the former only one of the memmove's could have been
> inlined assuming that only CONSTANT sizes are inlined.

Yes. Which is better than not being able to inline, assuming the
inlined branch is more likely to be taken, no?

> The other one had a
> NON-CONSTANT as long as pointer following and constant propagation was not
> effective together. In our case the "NON-CONSTANT branch" would have been 
> used,
> which is resolved by constant propagation (of the size of constant memory p
> points to). I assume the warning is triggered, because dead-code elimination
> has not removed the else part.

Probably yes, in this case. My performance worries were more about the
general case. But perhaps they are unfounded, IDK really. Does anybody
have a battery of string operation benchmarks that mirrors how real
Fortran code does string handling?

Anyway, the attached patch accomplishes that. It turns the tree dump I
showed two days ago into something like

  {
integer(kind=8) D.3484;
unsigned long D.3485;

D.3484 = *_p;
D.3485 = (unsigned long) D.3484 + 18446744073709551608;
if (D.3484 != 0)
  {
__builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1
sz: 1}, MIN_EXPR <(unsigned long) D.3484, 8>);
if ((unsigned long) D.3484 > 8)
  {
__builtin_memset ((void *) *p + 8, 32, D.3485);
  }
  }
  }

and gets rid of the -Wstringop-overflow warning. (It causes a two new
testsuite failures (gfortran.dg/dependency_49.f90 and
gfortran.dg/transfer_intrinsic_1.f90), but those are just tests that
grep for patterns in the .original dump and need to be adjusted).

If that is Ok, do you prefer it as part of the charlen-size_t patch,
or would you (GFortran maintainers in general) prefer that it's a
separate patch?

-- 
Janne Blomqvist
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 3767a28..6708ee03 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -6455,33 +6455,19 @@ gfc_trans_string_copy (stmtblock_t * block, tree 
dlength, tree dest,
   return;
 }
 
+  /* The string copy algorithm below generates code like
+
+ if (dlen > 0) {
+ memmove (dest, src, min(dlen, slen));
+ if (slen < dlen)
+ memset(&dest[slen], ' ', dlen - slen);
+ }
+  */
+
   /* Do nothing if the destination length is zero.  */
   cond = fold_build2_loc (input_location, GT_EXPR, boolean_type_node, dlen,
  build_int_cst (size_type_node, 0));
 
-  /* The following code was previously in _gfortran_copy_string:
-
-   // The two strings may overlap so we use memmove.
-   void
-   copy_string (GFC_INTEGER_4 destlen, char * dest,
-GFC_INTEGER_4 srclen, const char * src)
-   {
- if (srclen >= destlen)
-   {
- // This will truncate if too long.
- memmove (dest, src, destlen);
-   }
- else
-   {
- memmove (dest, src, srclen);
- // Pad with spaces.
- memset (&dest[srclen], ' ', destlen - srclen);
-   }
-   }
-
- We're now doing it here for better optimization, but the logic
- is the same.  */
-
   /* For non-default character kinds, we have to multiply the string
  length by the base type size.  */
   chartype = gfc_get_char_type (dkind);
@@ -6504,17 +6490,19 @@ gfc_trans_string_copy (stmtblock_t * block, tree 
dlength, tree dest,
   else
 src = gfc_build_addr_expr (pvoid_type_node, src);
 
-  /* Truncate string if source is too long.  */
-  cond2 = fold_build2_loc (input_location, GE_EXPR, boolean_type_node, slen,
-  dlen);
+  /* First do the memmove. */
+  tmp2 = fold_build2_loc (input_location, MIN_EXPR, TREE_TYPE (dlen), dlen,
+ slen);
   tmp2 = build_call_expr_loc (input_location,
  builtin_decl_explicit (BUILT_IN_MEMMOVE),
- 3, dest, src, dlen);
+ 3, dest, src, tmp2);
+  stmtblock_t tmpblock2;
+  gfc_init_block (&tmpblock2);
+  gfc_add_expr_to_block (&tmpblock2, tmp2);
 
-  /* Else copy and pad with spaces.  */
-  tmp3 = build_call_expr_loc (input_location,
- builtin_decl_explicit (BUILT_IN_MEMMOVE),
- 3, dest, src, slen);
+  /* If the destination is longer, fill the end with spaces.  */
+  cond2 = fold_build2_loc (input_location, LT_EXPR, boolean_type_node, slen,
+

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Andre Vehreschild
> Now when I think about this some more, I have a vague recollection
> that a long time ago it used to be something like that.  The problem
> is that MIN_EXPR will of course be
> NON-CONSTANT, so the memcpy call can't be inlined. Hence it was
> changed to two separate __builtin_memmove() calls to have better
> opportunity to inline. So probably a no-go to change it back. :(

I don't get that. From the former only one of the memmove's could have been
inlined assuming that only CONSTANT sizes are inlined. The other one had a
NON-CONSTANT as long as pointer following and constant propagation was not
effective together. In our case the "NON-CONSTANT branch" would have been used,
which is resolved by constant propagation (of the size of constant memory p
points to). I assume the warning is triggered, because dead-code elimination
has not removed the else part. 

Following this thought the MIN_EXPR would be propagated to 5 and the inliner
can do its magic. Albeit it may be that now some other optimization level will
trigger a warning, because some part has not been removed/constant replaced.
What do you think of that?

-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Janne Blomqvist
On Wed, Dec 21, 2016 at 1:50 PM, Andre Vehreschild  wrote:
>> Here p is the character variable, and _p is the charlen. My guess is
>> that the problem is that with -O1 it sees that the second memmove
>> would overflow p, but it doesn't realize that branch is never taken.
>> Cranking up the optimization level to -O2 and beyond makes it realize
>> it, and thus the warning disappears.
>>
>> Perhaps one could rewrite that to something like
>>
>> __builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1 sz: 1},
>> MIN_EXPR<(unsigned long) D.3598,8>);
>> if ((unsigned long) D.3598 > 8)
>>   {
>>   __builtin_memset ((void*) *p + 8, 32, D.3599);
>>   }
>
> That looks interesting. It assumes though, that D.3598 will *never* be
> negative. Because when it is negative 8 characters (cast to unsigned makes the
> negative number huge) will be copied, while in the former code memmove will
> reject the coping of a negative number of bytes. Therefore I propose to omit
> the cast in the MIN_EXPR and make the constant 8 signed, too. That should
> comply and mimick the former behavior more closely. What do you think? Who's
> going to try?

Now when I think about this some more, I have a vague recollection
that a long time ago it used to be something like that.  The problem
is that MIN_EXPR will of course be
NON-CONSTANT, so the memcpy call can't be inlined. Hence it was
changed to two separate __builtin_memmove() calls to have better
opportunity to inline. So probably a no-go to change it back. :(

-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Andre Vehreschild
> Here p is the character variable, and _p is the charlen. My guess is
> that the problem is that with -O1 it sees that the second memmove
> would overflow p, but it doesn't realize that branch is never taken.
> Cranking up the optimization level to -O2 and beyond makes it realize
> it, and thus the warning disappears.
> 
> Perhaps one could rewrite that to something like
> 
> __builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1 sz: 1},
> MIN_EXPR<(unsigned long) D.3598,8>);
> if ((unsigned long) D.3598 > 8)
>   {
>   __builtin_memset ((void*) *p + 8, 32, D.3599);
>   }

That looks interesting. It assumes though, that D.3598 will *never* be
negative. Because when it is negative 8 characters (cast to unsigned makes the
negative number huge) will be copied, while in the former code memmove will
reject the coping of a negative number of bytes. Therefore I propose to omit
the cast in the MIN_EXPR and make the constant 8 signed, too. That should
comply and mimick the former behavior more closely. What do you think? Who's
going to try?

-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Janne Blomqvist
On Wed, Dec 21, 2016 at 12:33 PM, Andre Vehreschild  wrote:
> Hi Janne,
>
>> But yes, I'm still seeing the warning messages with -O1 for
>> allocate_deferred_char_scalar_1.f03. AFAICT it's a bogus warning, but
>> I don't know how to get rid of it..
>
> No, that warning isn't at all bogus. The warning in fact is astonishingly
> precise. When I remember correctly, then the warning complains about trying to
> put a string of length 8 into memory of length 5. There is never a memory
> access error at runtime, because the code generated ensures that only 5 chars
> are copied, but I am impressed by the analysis done by some intermediate step
> of gcc. It figures, that memory is available for 5 characters only derefing a
> "static/constant" pointer and then learning that initially 8 chars are to be
> copied. I already tried to fix this by only generating code to copy the 5
> characters and make this knowledge available to the gimplifier, but I failed 
> to
> deref the pointer and get the information statically. So IMHO the warning is
> not bogus. It is absolutely correct and it is quite sophisticated to learn all
> the necessary facts, but I didn't find a way to get this done in the 
> front-end.
> We might be able to prevent the warning when there is a chance to add some 
> hook
> into the middle stages of the compiler, telling it, that everything is fine.
> But I have no idea what is possible and available there.

I suspect it's complaining about (from the -fdump-tree-original):

  {
integer(kind=8) D.3598;
unsigned long D.3599;

D.3598 = *_p;
D.3599 = (unsigned long) D.3598 + 18446744073709551608;
if (D.3598 != 0)
  {
if ((unsigned long) D.3598 <= 8)
  {
__builtin_memmove ((void *) *p, (void *)
&"12345679"[1]{lb: 1 sz: 1}, (unsigned long) D.3598);
  }
else
  {
__builtin_memmove ((void *) *p, (void *)
&"12345679"[1]{lb: 1 sz: 1}, 8);
__builtin_memset ((void *) *p + 8, 32, D.3599);
  }
  }
  }

Here p is the character variable, and _p is the charlen. My guess is
that the problem is that with -O1 it sees that the second memmove
would overflow p, but it doesn't realize that branch is never taken.
Cranking up the optimization level to -O2 and beyond makes it realize
it, and thus the warning disappears.

Perhaps one could rewrite that to something like

__builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1 sz: 1},
MIN_EXPR<(unsigned long) D.3598,8>);
if ((unsigned long) D.3598 > 8)
  {
  __builtin_memset ((void*) *p + 8, 32, D.3599);
  }


?



-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Andre Vehreschild
Hi Janne,

> But yes, I'm still seeing the warning messages with -O1 for
> allocate_deferred_char_scalar_1.f03. AFAICT it's a bogus warning, but
> I don't know how to get rid of it..

No, that warning isn't at all bogus. The warning in fact is astonishingly
precise. When I remember correctly, then the warning complains about trying to
put a string of length 8 into memory of length 5. There is never a memory
access error at runtime, because the code generated ensures that only 5 chars
are copied, but I am impressed by the analysis done by some intermediate step
of gcc. It figures, that memory is available for 5 characters only derefing a
"static/constant" pointer and then learning that initially 8 chars are to be
copied. I already tried to fix this by only generating code to copy the 5
characters and make this knowledge available to the gimplifier, but I failed to
deref the pointer and get the information statically. So IMHO the warning is
not bogus. It is absolutely correct and it is quite sophisticated to learn all
the necessary facts, but I didn't find a way to get this done in the front-end.
We might be able to prevent the warning when there is a chance to add some hook
into the middle stages of the compiler, telling it, that everything is fine.
But I have no idea what is possible and available there.

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Janne Blomqvist
On Wed, Dec 21, 2016 at 12:05 PM, Andre Vehreschild  wrote:
> Hi all,
>
> so I have learned that proposing to write "speaking code" is not very well
> taken.

If you want to make a patch introducing gfc_size_t_zero_node, go
ahead, at least I won't object.  I don't think
build_zero_cst(size_type_node) is that terrible myself, but I don't
have any hard opinions on this.

> Anyway, there is a patch (in two parts) hanging about changing the character
> length from int to size_t. It looks ok to me, but I do not have the privilege
> to ok it. Furthermore am I still not convinced that we can't do anything about
> the failing testcase allocate_deferred_char_scalar_1. So how do we proceed?

I have just verified that my fix for PR 78867 fixes the -flto failures
Dominique noticed. I have some other minor cleanup to do to the
charlen->size_t patch, and then I'll resubmit it.

But yes, I'm still seeing the warning messages with -O1 for
allocate_deferred_char_scalar_1.f03. AFAICT it's a bogus warning, but
I don't know how to get rid of it..


-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Andre Vehreschild
Hi all,

so I have learned that proposing to write "speaking code" is not very well
taken.

Anyway, there is a patch (in two parts) hanging about changing the character
length from int to size_t. It looks ok to me, but I do not have the privilege
to ok it. Furthermore am I still not convinced that we can't do anything about
the failing testcase allocate_deferred_char_scalar_1. So how do we proceed?

- Andre

On Tue, 20 Dec 2016 17:08:54 +0100
Jakub Jelinek  wrote:

> On Tue, Dec 20, 2016 at 05:04:54PM +0100, Andre Vehreschild wrote:
> > Well, then how about:
> > 
> > #define gfc_size_t_zero_node build_int_cst (size_type_node, 0)
> > 
> > We can't get any faster and for new and old gfortran-hackers one
> > identifier's meaning is faster to grasp than two's.  
> 
> Such macros can cause various maintenance issues, so I'm not in favor of
> that.  But if you as fortran maintainers want it, I won't object strongly.
> 
>   Jakub


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 05:04:54PM +0100, Andre Vehreschild wrote:
> Well, then how about:
> 
> #define gfc_size_t_zero_node build_int_cst (size_type_node, 0)
> 
> We can't get any faster and for new and old gfortran-hackers one identifier's
> meaning is faster to grasp than two's.

Such macros can cause various maintenance issues, so I'm not in favor of
that.  But if you as fortran maintainers want it, I won't object strongly.

Jakub


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Andre Vehreschild


On Tue, 20 Dec 2016 16:40:13 +0100
Jakub Jelinek  wrote:

> On Tue, Dec 20, 2016 at 04:29:07PM +0100, Andre Vehreschild wrote:
> > > The first one is GCC internal type for representing sizes, the latter is
> > > the C size_t (usually they have the same precision, they always have the
> > > same signedness (unsigned)).
> > > In the past sizetype actually has been a signed type with very special
> > > behavior.  
> > 
> > I am still wondering if it does not make sense to have something like
> > gfc_size_t_zero_node to prevent us from repeating build_zero_cst
> > (size_type_node) all the time. I had to use it 16 times, i.e., 16 times the
> > code for building a zero size type node is generated instead of a reference
> > to a "constant". And I don't want to know how often size_zero_node is used
> > in the wrong location.  
> 
> built_int_cst (size_type_node, 0) is actually faster than build_zero_cst,
> one fewer level of indirection.
> The 0 constant is cached in the type itself, so it actually in the end
> is basically just:
> return TREE_VEC_ELT (TYPE_CACHED_VALUES (type), 1);
> 
> Adding a variable to hold gfc_size_t_zero_node would mean you'd need to add
> a GC root to hold it.
> 
> Note, sizetype should be still what is usually used, only if you in the ABI
> have something declared as C size_t, then size_type_node should be used.

Well, then how about:

#define gfc_size_t_zero_node build_int_cst (size_type_node, 0)

We can't get any faster and for new and old gfortran-hackers one identifier's
meaning is faster to grasp than two's.

- Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 04:29:07PM +0100, Andre Vehreschild wrote:
> > The first one is GCC internal type for representing sizes, the latter is
> > the C size_t (usually they have the same precision, they always have the
> > same signedness (unsigned)).
> > In the past sizetype actually has been a signed type with very special
> > behavior.
> 
> I am still wondering if it does not make sense to have something like
> gfc_size_t_zero_node to prevent us from repeating build_zero_cst
> (size_type_node) all the time. I had to use it 16 times, i.e., 16 times the
> code for building a zero size type node is generated instead of a reference to
> a "constant". And I don't want to know how often size_zero_node is used in the
> wrong location.

built_int_cst (size_type_node, 0) is actually faster than build_zero_cst,
one fewer level of indirection.
The 0 constant is cached in the type itself, so it actually in the end
is basically just:
return TREE_VEC_ELT (TYPE_CACHED_VALUES (type), 1);

Adding a variable to hold gfc_size_t_zero_node would mean you'd need to add
a GC root to hold it.

Note, sizetype should be still what is usually used, only if you in the ABI
have something declared as C size_t, then size_type_node should be used.

Jakub


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Andre Vehreschild


On Tue, 20 Dec 2016 16:00:19 +0100
Jakub Jelinek  wrote:

> On Tue, Dec 20, 2016 at 04:55:29PM +0200, Janne Blomqvist wrote:
> > On Tue, Dec 20, 2016 at 3:42 PM, Andre Vehreschild  wrote:  
> > > Hi all,
> > >  
> > >> I think you should use build_zero_cst(size_type_node) instead of
> > >> size_zero_node as size_zero_node is of type sizetype which is not the
> > >> same as size_type_node. Otherwise looks good.  
> > >
> > > In the software design classes I took this was called a design error: Not
> > > choosing sufficiently different names for different artifacts. It was
> > > considered a beginner's error.  
> > 
> > Yeah, sizetype vs. size_type_node is confusing, to say the least..  
> 
> The first one is GCC internal type for representing sizes, the latter is
> the C size_t (usually they have the same precision, they always have the
> same signedness (unsigned)).
> In the past sizetype actually has been a signed type with very special
> behavior.

I am still wondering if it does not make sense to have something like
gfc_size_t_zero_node to prevent us from repeating build_zero_cst
(size_type_node) all the time. I had to use it 16 times, i.e., 16 times the
code for building a zero size type node is generated instead of a reference to
a "constant". And I don't want to know how often size_zero_node is used in the
wrong location.

- Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 04:55:29PM +0200, Janne Blomqvist wrote:
> On Tue, Dec 20, 2016 at 3:42 PM, Andre Vehreschild  wrote:
> > Hi all,
> >
> >> I think you should use build_zero_cst(size_type_node) instead of
> >> size_zero_node as size_zero_node is of type sizetype which is not the
> >> same as size_type_node. Otherwise looks good.
> >
> > In the software design classes I took this was called a design error: Not
> > choosing sufficiently different names for different artifacts. It was
> > considered a beginner's error.
> 
> Yeah, sizetype vs. size_type_node is confusing, to say the least..

The first one is GCC internal type for representing sizes, the latter is
the C size_t (usually they have the same precision, they always have the
same signedness (unsigned)).
In the past sizetype actually has been a signed type with very special
behavior.

Jakub


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Janne Blomqvist
On Tue, Dec 20, 2016 at 3:42 PM, Andre Vehreschild  wrote:
> Hi all,
>
>> I think you should use build_zero_cst(size_type_node) instead of
>> size_zero_node as size_zero_node is of type sizetype which is not the
>> same as size_type_node. Otherwise looks good.
>
> In the software design classes I took this was called a design error: Not
> choosing sufficiently different names for different artifacts. It was
> considered a beginner's error.

Yeah, sizetype vs. size_type_node is confusing, to say the least..

> So now I have to repeat myself 16 times only to work around this b***. Nothing
> that will improve gfortran's maintainability.
>
> Second version of the changes needed for caf attached. Bootstrapped and
> regtested fine besides prior known
>
> FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03 -O1  (test for excess
> errors)
>
> on x86_64-linux/f23.

Ok, looks good.



-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Andre Vehreschild
Hi all,

> I think you should use build_zero_cst(size_type_node) instead of
> size_zero_node as size_zero_node is of type sizetype which is not the
> same as size_type_node. Otherwise looks good.

In the software design classes I took this was called a design error: Not
choosing sufficiently different names for different artifacts. It was
considered a beginner's error.

So now I have to repeat myself 16 times only to work around this b***. Nothing
that will improve gfortran's maintainability.

Second version of the changes needed for caf attached. Bootstrapped and
regtested fine besides prior known

FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03 -O1  (test for excess
errors)

on x86_64-linux/f23.

- Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


pr78534_caf_v2.clog
Description: Binary data
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 5b05a3d..1604bc8 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -4211,7 +4211,7 @@ size or one for a scalar.
 
 @item @emph{Syntax}:
 @code{void caf_register (size_t size, caf_register_t type, caf_token_t *token,
-gfc_descriptor_t *desc, int *stat, char *errmsg, int errmsg_len)}
+gfc_descriptor_t *desc, int *stat, char *errmsg, size_t errmsg_len)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4263,7 +4263,7 @@ library is only expected to free memory it allocated itself during a call to
 
 @item @emph{Syntax}:
 @code{void caf_deregister (caf_token_t *token, caf_deregister_t type,
-int *stat, char *errmsg, int errmsg_len)}
+int *stat, char *errmsg, size_t errmsg_len)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4322,7 +4322,8 @@ to a remote image identified by the image_index.
 @item @emph{Syntax}:
 @code{void _gfortran_caf_send (caf_token_t token, size_t offset,
 int image_index, gfc_descriptor_t *dest, caf_vector_t *dst_vector,
-gfc_descriptor_t *src, int dst_kind, int src_kind, bool may_require_tmp)}
+gfc_descriptor_t *src, int dst_kind, int src_kind, bool may_require_tmp,
+int *stat)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4345,6 +4346,9 @@ time that the @var{dest} and @var{src} either cannot overlap or overlap (fully
 or partially) such that walking @var{src} and @var{dest} in element wise
 element order (honoring the stride value) will not lead to wrong results.
 Otherwise, the value is true.
+@item @var{stat} @tab intent(out) when non-NULL give the result of the
+operation, i.e., zero on success and non-zero on error.  When NULL and error
+occurs, then an error message is printed and the program is terminated.
 @end multitable
 
 @item @emph{NOTES}
@@ -4375,7 +4379,8 @@ image identified by the image_index.
 @item @emph{Syntax}:
 @code{void _gfortran_caf_get (caf_token_t token, size_t offset,
 int image_index, gfc_descriptor_t *src, caf_vector_t *src_vector,
-gfc_descriptor_t *dest, int src_kind, int dst_kind, bool may_require_tmp)}
+gfc_descriptor_t *dest, int src_kind, int dst_kind, bool may_require_tmp,
+int *stat)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4398,6 +4403,9 @@ time that the @var{dest} and @var{src} either cannot overlap or overlap (fully
 or partially) such that walking @var{src} and @var{dest} in element wise
 element order (honoring the stride value) will not lead to wrong results.
 Otherwise, the value is true.
+@item @var{stat} @tab intent(out) when non-NULL give the result of the
+operation, i.e., zero on success and non-zero on error.  When NULL and error
+occurs, then an error message is printed and the program is terminated.
 @end multitable
 
 @item @emph{NOTES}
@@ -4430,7 +4438,7 @@ dst_image_index.
 int dst_image_index, gfc_descriptor_t *dest, caf_vector_t *dst_vector,
 caf_token_t src_token, size_t src_offset, int src_image_index,
 gfc_descriptor_t *src, caf_vector_t *src_vector, int dst_kind, int src_kind,
-bool may_require_tmp)}
+bool may_require_tmp, int *stat)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4461,6 +4469,9 @@ time that the @var{dest} and @var{src} either cannot overlap or overlap (fully
 or partially) such that walking @var{src} and @var{dest} in element wise
 element order (honoring the stride value) will not lead to wrong results.
 Otherwise, the value is true.
+@item @var{stat} @tab intent(out) when non-NULL give the result of the
+operation, i.e., zero on success and non-zero on error.  When NULL and error
+occurs, then an error message is printed and the program is terminated.
 @end multitable
 
 @item @emph{NOTES}
@@ -4673,7 +4684,7 @@ been locked by the same image is an error.
 
 @item @emph{Syntax}:
 @code{void _gfortran_caf_lock (caf_token_t token, size_t index, int image_index,
-int *aquired_lock, int *stat, char *errmsg, int errmsg_len)}
+int *aquired_lock, int *stat, char *errmsg, size_t errmsg_len)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4708,7 +4719,7 @@ which is unlocked or has bee

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread FX
Dear Bob,

First, regarding the ABI vs. API question: there is no consistent API for how 
to pass between Fortran and C strings, unless one uses Fortran 2003’s 
ISO_C_BINDING. It’s an ABI detail, in the sense that every compiler will choose 
to do things their own way: most compilers who pass a hidden length parameter, 
although its size (32-bit or 64-bit or size_t) and position (either after the 
char pointer, or at the end of the argument list) are variable between 
compilers. So, any code that does this is already compiler-specific.

Second, there are good reasons we might want to change this. One is possible 
use cases (although there are few, by definition, because we simply don’t 
support those right now). The second one is compatibility with C 
string-handling functions, who operate on size_t arguments, which means we can 
now use those functions without casting types around all the time.

Finally, if we’re making this change, we welcome any feedback on how to make it 
as easy as possible to handle in user code. Documentation, preprocessor macros, 
etc.

In particular, one of the things we will need to address is on helping widely 
used code to adapt to the change, so that. One example I am thinking of, that 
uses old-style C/Fortran interfaces, is MPI libraries (openmpi & mpich). We 
definitely need to test those to make sure nothing breaks if we are going to 
proceed — or they need to be fixed upstream well before we release, and with 
due note of the incompatibility in our release notes.


Cheers,
FX

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-19 Thread Bob Deen

On 12/19/16 11:33 AM, Janne Blomqvist wrote:

On Mon, Dec 19, 2016 at 6:43 PM, Bob Deen  wrote:

Hi all...

I never saw any followup on this...?

It's one thing to break the ABI between the compiler and the gfortran
library; those can generally be expected to be in sync.  It's another to
break the ABI between two *languages*, when there might be no such
expectation (especially if gcc does NOT break their ABI at the same version
number transition).  Yes, the pre-ISO_C_BINDING method may be old-fashioned,
but it is a de-facto standard, and breaking it should not be done lightly.


First: No, it's not done "lightly". And secondly, cross-language
interfacing is always tricky, which is why most languages, including
Fortran with ISO_C_BINDING, have devised standardized ways for
communication with C so users don't need to rely on various cross-call
mechanisms that are not guaranteed to work.


Apologies if I offended (and to Steve too).  I see all the deliberation 
you're doing for breaking the language->library ABI, and appreciate 
that.  It's well-justified.  My point, however, is that with this change 
you are breaking an entirely *different* ABI - that between Fortran and 
C - and the sum total of discussion was one message from Janne pointing 
out that it was breaking (thanks for that heads-up, I had missed it!), 
with no followup.  Janne, you yourself in that message questioned the 
need for large strings, and had no use cases in response to FX's inquiry.


Now that I think about it, it's not even an ABI change, it's an API 
change... requiring a code change, not just a recompile.


So in this case, this change represents (AFAIK) the only breakage in the 
old-style Fortran<->C ABI/API, with no known use cases... and thus my 
question about whether it's justified.  It's a fair question.  I'm not 
arguing the language->library ABI at all.



C changed to use size_t for string lengths instead of int with ANSI C
in, what, 1989.  With 2-socket servers, typically used e.g. in HPC
clusters, today easily having hundreds of gigs of RAM, limiting
GFortran char lengths to 2 GB for all eternity in the name of
compatibility seems quaint at best. Maybe in your organization Fortran
is legacy code that YE SHALL NOT TOUCH, but GFortran also has to cater
to users who have chosen to write new code in Fortran.


I understand that.  It just seems that opening up an entirely *new* 
ABI/API for breakage deserved a little more discussion.  Y'all are the 
ones doing the (mostly volunteer) work on gfortran, and I appreciate it. 
 You're also much more invested in the future of the language than I 
(yeah, it's mostly legacy code for us).  If you end up deciding that it 
needs to be done, then I'll deal with it.  I just wanted to chime in 
that there are users who will be affected.  If I'm the only one, I 
wouldn't want to stand in the way of progress - but also don't want to 
get steamrolled if it's not an important change, or if there are other 
affected users.


So... ARE there any other affected users out there??


Oh, you have macros rather than hard-coded int all over the place?
Shouldn't it be a relatively trivial affair then to define that macro
appropriately depending on which compiler and which version you're
using?


I wouldn't call it trivial by any means... it's tricky code I haven't 
had to look at in 10 years.  But in the end, probably doable.



Steve showed how you can do it for Fortran. From the C side, just
check the version from the __GNUC__ macro.


I dislike having to check for version numbers (feels kludgy) but that's 
a personal preference.  That will probably work, with a bit of futzing.


Thanks for your attention...

-Bob

Bob Deen @ NASA-JPL Multimission Image Processing Lab
bob.d...@jpl.nasa.gov



Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-19 Thread Janne Blomqvist
On Mon, Dec 19, 2016 at 6:43 PM, Bob Deen  wrote:
> Hi all...
>
> I never saw any followup on this...?
>
> It's one thing to break the ABI between the compiler and the gfortran
> library; those can generally be expected to be in sync.  It's another to
> break the ABI between two *languages*, when there might be no such
> expectation (especially if gcc does NOT break their ABI at the same version
> number transition).  Yes, the pre-ISO_C_BINDING method may be old-fashioned,
> but it is a de-facto standard, and breaking it should not be done lightly.

First: No, it's not done "lightly". And secondly, cross-language
interfacing is always tricky, which is why most languages, including
Fortran with ISO_C_BINDING, have devised standardized ways for
communication with C so users don't need to rely on various cross-call
mechanisms that are not guaranteed to work.

That the charlen is a hidden argument added at the end of type int is
AFAIK a fairly common implementation choice, though I'm not sure if it
can be called a de-facto standard.  Considering that Intel Fortran has
switched to size_t several years ago (5-ish?), and AFAIU it's the most
used Fortran compiler around in addition to GFortran, and the world
hasn't crashed down due to it, I suspect users can adapt to the change
with relatively little drama.

That gcc would change it's ABI at all, and especially in conjunction
with gfortran, is a pipe dream.

C changed to use size_t for string lengths instead of int with ANSI C
in, what, 1989.  With 2-socket servers, typically used e.g. in HPC
clusters, today easily having hundreds of gigs of RAM, limiting
GFortran char lengths to 2 GB for all eternity in the name of
compatibility seems quaint at best. Maybe in your organization Fortran
is legacy code that YE SHALL NOT TOUCH, but GFortran also has to cater
to users who have chosen to write new code in Fortran.

> If you do proceed with changing the size, I would request that there at
> least be a facility to reliably tell at compile time (on the C side) which
> definition is being used, so I can adjust our macros accordingly.

Oh, you have macros rather than hard-coded int all over the place?
Shouldn't it be a relatively trivial affair then to define that macro
appropriately depending on which compiler and which version you're
using?

Steve showed how you can do it for Fortran. From the C side, just
check the version from the __GNUC__ macro.

> Our code
> does depend on the size, and it has to cross-platform (and now, if this
> change is made, cross-version), so with this change I would have to support
> both int and size_t.

Well, if you add the option to use size_t you should be able to use
ifort as well. :)

> A C-side preprocessor symbol definition would do the trick.  Of course that
> assumes the versions of gcc/g++ and gfortran are in sync, which is never
> guaranteed.  But that assumption is better than nothing.  Unless someone has
> a better idea...?

Yeah, I think that's the best idea.

Another option would be to implement some kind of
-fcharacter-length=[int,size_t] command-line option. But that would
make the patch a lot more complicated since one would need to typecast
the character length argument when calling libgfortran. And, you'd
still have to have some version-dependent checks to see if gfortran
would accept that option. And like other similar options like
-fdefault-this-or-that it would change the ABI, so code compiled with
that option would be incompatible with code compiled without it. So in
the end I'm not convinced such an option would actually make life any
easier for our users.

> Perhaps it might be best to wait until a time when gcc is also breaking
> their ABI, so that there's no question of code (on either side) working
> across the transition...?

AFAIK there is no ABI change planned for gcc. For better or worse, the
C language is relatively stable and doesn't change much.

> P.S.  I'm just a lurker here, but I lurk specifically to look for things
> that will break our code base, like this  ;-)

Well, then you ought to be aware the ABI cleanup page on the wiki,
where the char length issue has been listed for, what, 5 years or so,
so it can't really be a surprise that it will happen at some point,
can it...?

>
> Bob.Deen @ NASA-JPL Multimission Image Processing Lab
> bob.d...@jpl.nasa.gov
>
>
>
> On 12/12/16 10:26 AM, Bob Deen wrote:
>>
>>
>>> However, this will also affect people doing C->Fortran calls the
>>> old-fashioned way without ISO_C_BINDING, as they will have to change
>>> the string length argument from int to size_t in their prototypes.
>>> Then again, Intel Fortran did this some years ago so I guess at least
>>> people who care about portability to several compilers are aware.
>>
>>
>> We do a ton of this (old fashioned c-fortran binding) and changing the
>> string length argument size will have a big impact on us.  We don't use the
>> Intel compiler so we never noticed a change there.
>>
>> Is there really a use case 

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-19 Thread Steve Kargl
On Mon, Dec 19, 2016 at 08:43:01AM -0800, Bob Deen wrote:
> 
> It's one thing to break the ABI between the compiler and the gfortran 
> library; those can generally be expected to be in sync.  It's another to 
> break the ABI between two *languages*, when there might be no such 
> expectation (especially if gcc does NOT break their ABI at the same 
> version number transition).  Yes, the pre-ISO_C_BINDING method may be 
> old-fashioned, but it is a de-facto standard, and breaking it should not 
> be done lightly.

Do you really think that those of us who actively contribute to 
gfortran development take breaking the ABI lightly?  We have put 
off changes to gfortran's library for several years to specifically 
avoid ABI breakage.  It seems that there is never a "Good Time" to
break the ABI.  However, in this case, support for F2008 9.6.4.8,
Defined Input/Output, necessitates a change in the ABI.  Instead of
breaking the ABI multiple times, it has been decided to try to cleanup
some long standing issues with libgfortran.

> If you do proceed with changing the size, I would request that there at 
> least be a facility to reliably tell at compile time (on the C side) 
> which definition is being used, so I can adjust our macros accordingly. 
> Our code does depend on the size, and it has to cross-platform (and now, 
> if this change is made, cross-version), so with this change I would have 
> to support both int and size_t.

As the breakage is going to occur with gfortran 7.0, you do

% cat a.F90
#if defined(__GFORTRAN__) && (__GNUC__ > 6)
print *, '7'
#else
print *, 'not 7'
#endif
end
% gfc7 -E a.F90 | cat -s
] gfc7 -E a.F90 | cat -s
# 1 "a.F90"
# 1 ""
# 1 ""
# 1 "a.F90"

print *, '7'

end
% gfortran6 -E a.F90 | cat -s
# 1 "a.F90"
# 1 ""
# 1 ""
# 1 "a.F90"

print *, 'not 7'

end

> Perhaps it might be best to wait until a time when gcc is also breaking 
> their ABI, so that there's no question of code (on either side) working 
> across the transition...?

There is never a good time.  If we are to wait for gcc, should
we remove support for Defined Input/Output from the compiler?

-- 
Steve


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-19 Thread Bob Deen

Hi all...

I never saw any followup on this...?

It's one thing to break the ABI between the compiler and the gfortran 
library; those can generally be expected to be in sync.  It's another to 
break the ABI between two *languages*, when there might be no such 
expectation (especially if gcc does NOT break their ABI at the same 
version number transition).  Yes, the pre-ISO_C_BINDING method may be 
old-fashioned, but it is a de-facto standard, and breaking it should not 
be done lightly.


If you do proceed with changing the size, I would request that there at 
least be a facility to reliably tell at compile time (on the C side) 
which definition is being used, so I can adjust our macros accordingly. 
Our code does depend on the size, and it has to cross-platform (and now, 
if this change is made, cross-version), so with this change I would have 
to support both int and size_t.


A C-side preprocessor symbol definition would do the trick.  Of course 
that assumes the versions of gcc/g++ and gfortran are in sync, which is 
never guaranteed.  But that assumption is better than nothing.  Unless 
someone has a better idea...?


Perhaps it might be best to wait until a time when gcc is also breaking 
their ABI, so that there's no question of code (on either side) working 
across the transition...?


Thanks...

-Bob

P.S.  I'm just a lurker here, but I lurk specifically to look for things 
that will break our code base, like this  ;-)


Bob.Deen @ NASA-JPL Multimission Image Processing Lab
bob.d...@jpl.nasa.gov


On 12/12/16 10:26 AM, Bob Deen wrote:



However, this will also affect people doing C->Fortran calls the
old-fashioned way without ISO_C_BINDING, as they will have to change
the string length argument from int to size_t in their prototypes.
Then again, Intel Fortran did this some years ago so I guess at least
people who care about portability to several compilers are aware.


We do a ton of this (old fashioned c-fortran binding) and changing the string 
length argument size will have a big impact on us.  We don't use the Intel 
compiler so we never noticed a change there.

Is there really a use case for strings > 2 GB that justifies the breakage?  I certainly 
understand wanting to do it "right" but I'm probably not the only one with 
practical considerations that argue against it if there are no compelling use cases.

Thanks...

-Bob

Bob Deen @ NASA-JPL Multimission Image Processing Lab
bob.d...@jpl.nasa.gov






Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-16 Thread Dominique d'Humières

> Le 16 déc. 2016 à 19:06, Janne Blomqvist  a écrit :
> 
> On Fri, Dec 16, 2016 at 5:50 PM, Dominique d'Humières
>  wrote:
>> Hi Janne,
>> 
>> I have applied your two patches and found that I had to skip the patches for 
>> resolve.c and match.c due to the errors
>> 
>> ../../p_work/gcc/fortran/resolve.c: In function 'void 
>> resolve_select_type(gfc_code*, gfc_namespace*)':
>> ../../p_work/gcc/fortran/resolve.c:8731:58: error: format '%lld' expects 
>> argument of type 'long long int', but argument 4 has type 'long int' 
>> [-Werror=format=]
>>  gfc_basic_typename (c->ts.type), charlen, c->ts.kind);
>> 
>> and
>> 
>> ../../p_work/gcc/fortran/match.c: In function 'gfc_symtree* 
>> select_intrinsic_set_tmp(gfc_typespec*)':
>> ../../p_work/gcc/fortran/match.c:5786:55: error: format '%lld' expects 
>> argument of type 'long long int', but argument 4 has type 'long int' 
>> [-Werror=format=]
>>   gfc_basic_typename (ts->type), charlen, ts->kind);
>> 
> 
> Oh, blast. Of course you're right, from the second patch, the part
> that touched resolve.c and match.c are incorrect.
> 
>> while the patch for dump-parse-tree.c was needed.
>> 
>> With the patches applied I see several failures in the test suite compiled 
>> with ‘-m64 -flto’, but not with ‘-m32 -flto’ of the kind
> 
> Ok, thanks for the hint. I haven't tested with -flto.  Do you just run
> it with something like
> 
> CFLAGS="-flto" make -j $NCPUS check-fortran
> 
> or how do you do it?

Due to recurrent problems with -flto, I have the following change in my working 
tree

--- ../_clean/gcc/testsuite/lib/gfortran-dg.exp 2016-02-12 20:21:04.0 
+0100
+++ gcc/testsuite/lib/gfortran-dg.exp   2016-02-13 10:03:11.0 +0100
@@ -140,8 +140,16 @@ proc gfortran-dg-runtest { testcases fla
# we cycle through the option list, otherwise we don't
if [expr [search_for $test "dg-do run"]] {
set option_list $torture_with_loops
+   if [check_effective_target_lto] {
+   lappend option_list { -g -flto }
+   # lappend option_list { -g -O3 -fwhole-program -flto }
+   }
} else {
set option_list [list { -O } ]
+   # if [check_effective_target_lto] {
+   # lappend option_list { -g -flto }
+   # lappend option_list { -g -O3 -fwhole-program -flto }
+   # }
}
 
set nshort [file tail [file dirname $test]]/[file tail $test]

i.e., I add the options ‘-g -flto’ to the list of options used by the gfortran 
test suite.

> 
> I don't have lldb, but I guess I can see roughly the same with gdb..

For what I do, it is close to gdb (I did not succeed to authorize it on recent 
darwin).

Dominique

>> The affected tests are
>> 
>> array_constructor_17.f90
>> auto_char_len_3.f90
>> char_length_14.f90
>> char_length_5.f90
>> char_result_*.f90
>> charlen_03.f90
>> deferred_type_param_4.f90
>> dummy_procedure_3.f90
>> mapping_[12].f90
>> module_read_[12].f90
>> parens_5.f90
>> pr57910.f90
>> proc_ptr_comp_16.f90
>> result_in_spec_2.f90
>> spec_expr_7.f90
>> string_length_1.f90
>> transfer_intrinsic_3.f90
>> widechar_6.f90
>> zero_length_1.f90
>> 
>> Note that I did not run lldb on all of them, thus I cannot guarantee that 
>> all fail along the same pattern.
>> 
>> Cheers,
>> 
>> Dominique
>> 
> 
> 
> 
> -- 
> Janne Blomqvist



Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-16 Thread Janne Blomqvist
On Fri, Dec 16, 2016 at 5:50 PM, Dominique d'Humières
 wrote:
> Hi Janne,
>
> I have applied your two patches and found that I had to skip the patches for 
> resolve.c and match.c due to the errors
>
> ../../p_work/gcc/fortran/resolve.c: In function 'void 
> resolve_select_type(gfc_code*, gfc_namespace*)':
> ../../p_work/gcc/fortran/resolve.c:8731:58: error: format '%lld' expects 
> argument of type 'long long int', but argument 4 has type 'long int' 
> [-Werror=format=]
>   gfc_basic_typename (c->ts.type), charlen, c->ts.kind);
>
> and
>
> ../../p_work/gcc/fortran/match.c: In function 'gfc_symtree* 
> select_intrinsic_set_tmp(gfc_typespec*)':
> ../../p_work/gcc/fortran/match.c:5786:55: error: format '%lld' expects 
> argument of type 'long long int', but argument 4 has type 'long int' 
> [-Werror=format=]
>gfc_basic_typename (ts->type), charlen, ts->kind);
>

Oh, blast. Of course you're right, from the second patch, the part
that touched resolve.c and match.c are incorrect.

> while the patch for dump-parse-tree.c was needed.
>
> With the patches applied I see several failures in the test suite compiled 
> with ‘-m64 -flto’, but not with ‘-m32 -flto’ of the kind

Ok, thanks for the hint. I haven't tested with -flto.  Do you just run
it with something like

CFLAGS="-flto" make -j $NCPUS check-fortran

or how do you do it?

I don't have lldb, but I guess I can see roughly the same with gdb..

>
> [Book15] f90/bug% gfc 
> /opt/gcc/work/gcc/testsuite/gfortran.dg/char_result_1.f90 -flto -c
> [Book15] f90/bug% lldb 
> /opt/gcc/gcc7w/libexec/gcc/x86_64-apple-darwin16.1.0/7.0.0/lto1
> (lldb) target create 
> "/opt/gcc/gcc7w/libexec/gcc/x86_64-apple-darwin16.1.0/7.0.0/lto1"
> Current executable set to 
> '/opt/gcc/gcc7w/libexec/gcc/x86_64-apple-darwin16.1.0/7.0.0/lto1' (x86_64).
> (lldb) run char_result_1.o
> Process 1310 launched: 
> '/opt/gcc/gcc7w/libexec/gcc/x86_64-apple-darwin16.1.0/7.0.0/lto1' (x86_64)
> Reading object files: char_result_1.o {GC start 2195k}
> Reading the callgraph
> Merging declarations
> Reading summaries
> Reading function bodies:
> Performing interprocedural optimizations
>
>   Assembling functions:
> f2 f1 double test 
> MAIN__Process 1310 stopped
> * thread #1: tid = 0x54ac, 0x0001008f76cf 
> lto1`lto_input_tree_ref(ib=, data_in=0x000142621090, 
> fn=0x, tag=) + 127 at lto-streamer-in.c:332, 
> queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, 
> address=0x18)
> frame #0: 0x0001008f76cf lto1`lto_input_tree_ref(ib=, 
> data_in=0x000142621090, fn=0x, tag=) + 127 
> at lto-streamer-in.c:332
>329
>330  case LTO_ssa_name_ref:
>331ix_u = streamer_read_uhwi (ib);
> -> 332result = (*SSANAMES (fn))[ix_u];
>333break;
>334
>335  case LTO_field_decl_ref:
> (lldb) bt
> * thread #1: tid = 0x54ac, 0x0001008f76cf 
> lto1`lto_input_tree_ref(ib=, data_in=0x000142621090, 
> fn=0x, tag=) + 127 at lto-streamer-in.c:332, 
> queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, 
> address=0x18)
>   * frame #0: 0x0001008f76cf lto1`lto_input_tree_ref(ib=, 
> data_in=0x000142621090, fn=0x, tag=) + 127 
> at lto-streamer-in.c:332
> frame #1: 0x0001008f78f1 lto1`lto_input_tree_1(ib=0x7fff5fbfeeb0, 
> data_in=0x000142621090, tag=, hash=) + 177 at 
> lto-streamer-in.c:1446
> frame #2: 0x0001008f7cb8 lto1`lto_input_tree(ib=0x7fff5fbfeeb0, 
> data_in=0x000142621090) + 88 at lto-streamer-in.c:1492
> frame #3: 0x000100ce8b43 
> lto1`streamer_read_tree_body(lto_input_block*, data_in*, tree_node*) + 51 at 
> tree-streamer-in.c:893
> frame #4: 0x000100ce8b10 
> lto1`streamer_read_tree_body(ib=0x7fff5fbfeeb0, 
> data_in=0x000142621090, expr=0x000143944af0) + 2096
> frame #5: 0x0001008f7110 
> lto1`::lto_read_tree_1(ib=0x7fff5fbfeeb0, data_in=0x000142621090, 
> expr=0x000143944af0) + 32 at lto-streamer-in.c:1333
> frame #6: 0x0001008f78b9 lto1`lto_input_tree_1(lto_input_block*, 
> data_in*, LTO_tags, unsigned int) + 38 at lto-streamer-in.c:1363
> frame #7: 0x0001008f7893 lto1`lto_input_tree_1(lto_input_block*, 
> data_in*, LTO_tags, unsigned int) + 20
> frame #8: 0x0001008f787f lto1`lto_input_tree_1(ib=0x7fff5fbfeeb0, 
> data_in=0x000142621090, tag=, hash=2871463685) + 63
> frame #9: 0x0001008f7c03 lto1`lto_input_scc(ib=0x7fff5fbfeeb0, 
> data_in=0x000142621090, len=0x7fff5fbfede8, 
> entry_len=0x7fff5fbfedec) + 371 at lto-streamer-in.c:1387
> frame #10: 0x0001008f7c91 lto1`lto_input_tree(ib=0x7fff5fbfeeb0, 
> data_in=0x000142621090) + 49 at lto-streamer-in.c:1490
> frame #11: 0x0001008f8d93 
> lto1`::lto_read_body_or_constructor(file_data=0x000143721000, 
> data=, node=, 
> section_type=LTO_section_function_body) + 931 at lto-streamer-in.c:1045
>  

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-16 Thread Janne Blomqvist
On Fri, Dec 16, 2016 at 5:34 PM, Andre Vehreschild  wrote:
> Hi Janne, hi all,
>
> as promised please find attached the change from int32 to size_t for the
> caf-libraries. Because the caf-libraries do not require special notions
> indicated by negative values I went for using size_t there. I assume this will
> be easier to keep in sync for all caf-libraries, because the size_t is
> available on all modern platforms. I also took the liberty to fix the
> specifiers in trans-decl for the caf-function-declarations and update the
> documentation on the caf-functions in gfortran.texi where some parameters 
> where
> missing.
>
> These additional changes bootstrap fine and induce no new regressions on
> x86_64-linux/f23.

I think you should use build_zero_cst(size_type_node) instead of
size_zero_node as size_zero_node is of type sizetype which is not the
same as size_type_node. Otherwise looks good.

And yes, I think it makes sense to use size_t directly instead of
introducing the GFortran specific gfc_charlen_type typedef.

> I am still not sure, whether we shouldn't address the regression in
> allocate_deferred_char_scalar_1. I did some research, but could not yet come 
> to
> a practical solution. The backend somehow deduces that the memory pointed to
> has size 5 only. But I haven't found a way to do this in the front end.
>
> Comments on these changes?
>
> - Andre
>
> On Tue, 13 Dec 2016 21:08:51 +0200
> Janne Blomqvist  wrote:
>
>> On Mon, Dec 12, 2016 at 9:39 PM, Janne Blomqvist
>>  wrote:
>> > On Mon, Dec 12, 2016 at 7:39 PM, Andre Vehreschild  wrote:
>> >> Hi Janne,
>> >>
>> >> I found that you are favoring build_int_cst (size_type_node, 0) over
>> >> size_zero_node. Is there a reason to this?
>> >
>> > Yes. AFAIU size_zero_node is a zero constant for sizetype which is not
>> > the same as size_type_node.
>> >
>> > AFAIK the difference is that size_type_node is the C size_t type,
>> > whereas sizetype is a GCC internal type used for address expressions.
>> > On a "normal" target I understand that they are the same size, but
>> > there are some slight differences in semantics, e.g. size_type_node
>> > like C unsigned integer types is defined to wrap around on overflow
>> > whereas sizetype is undefined on overflow.
>> >
>> > I don't know if GCC supports some strange targets with some kind of
>> > segmented memory where the size of sizetype would be different from
>> > size_type_node, but I guess it's possible in theory at least.
>> >
>> > That being said, now that you mention in I should be using
>> > build_zero_cst (some_type_node) instead of
>> > build_int_cst(some_type_node, 0). There's also build_one_cst that I
>> > should use.
>> >
>> >> Furthermore did I have to patch this:
>> >>
>> >> diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
>> >> index 585f25d..f374558 100644
>> >> --- a/gcc/fortran/dump-parse-tree.c
>> >> +++ b/gcc/fortran/dump-parse-tree.c
>> >> @@ -465,7 +465,7 @@ show_expr (gfc_expr *p)
>> >>   break;
>> >>
>> >> case BT_HOLLERITH:
>> >> - fprintf (dumpfile, "%dH", p->representation.length);
>> >> + fprintf (dumpfile, "%zdH", p->representation.length);
>> >>   c = p->representation.string;
>> >>   for (i = 0; i < p->representation.length; i++, c++)
>> >> {
>> >>
>> >> to bootstrap on x86_64-linux/f23.
>> >
>> > Ah, thanks for the catch. I'll fix it by using HOST_WIDE_INT_PRINT_DEC
>> > since I'll have to change gfc_charlen_t to be a typedef form
>> > HOST_WIDE_INT (see my answer to FX).
>> >
>> >> And I have this regression:
>> >>
>> >> FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03   -O1  (test for
>> >> excess errors)
>> >>
>> >> allocate_deferred_char_scalar_1.f03:184:0:
>> >>
>> >>  p = '12345679'
>> >>
>> >> Warning: '__builtin_memcpy' writing 8 bytes into a region of size 5
>> >> overflows the destination [-Wstringop-overflow=]
>> >>  allocate_deferred_char_scalar_1.f03:242:0:
>> >>
>> >>  p = 4_'12345679'
>> >>
>> >> Warning: '__builtin_memcpy' writing 32 bytes into a region of size 20
>> >> overflows the destination [-Wstringop-overflow=]
>> >
>> > I'm seeing that too, but I assumed they would be fixed by Paul's
>> > recent patch which I don't yet have in my tree yet due to the git
>> > mirror being stuck..
>> >
>> >> Btw, the patch for changing the ABI of the coarray-libs is already nearly
>> >> done. I just need to figure that what the state of regressions is with and
>> >> without my change.
>> >
>> > Thanks.
>> >
>> > I'll produce an updated patch with the changes discussed so far.
>> >
>> >
>> > --
>> > Janne Blomqvist
>>
>> Hi,
>>
>> attached is the updated patch that applies on top of the original. I
>> didn't do the charlen_zero_node etc, I just fixed the relatively few
>> places in my previous patch rather than everywhere in the entire
>> frontend.
>>
>> Now that the git mirror is working again, I see though those warnings
>> with -O1 from gfortran.dg/allocate_defe

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-16 Thread Dominique d'Humières
Hi Janne,

I have applied your two patches and found that I had to skip the patches for 
resolve.c and match.c due to the errors

../../p_work/gcc/fortran/resolve.c: In function 'void 
resolve_select_type(gfc_code*, gfc_namespace*)':
../../p_work/gcc/fortran/resolve.c:8731:58: error: format '%lld' expects 
argument of type 'long long int', but argument 4 has type 'long int' 
[-Werror=format=]
  gfc_basic_typename (c->ts.type), charlen, c->ts.kind);

and

../../p_work/gcc/fortran/match.c: In function 'gfc_symtree* 
select_intrinsic_set_tmp(gfc_typespec*)':
../../p_work/gcc/fortran/match.c:5786:55: error: format '%lld' expects argument 
of type 'long long int', but argument 4 has type 'long int' [-Werror=format=]
   gfc_basic_typename (ts->type), charlen, ts->kind);

while the patch for dump-parse-tree.c was needed.

With the patches applied I see several failures in the test suite compiled with 
‘-m64 -flto’, but not with ‘-m32 -flto’ of the kind

[Book15] f90/bug% gfc /opt/gcc/work/gcc/testsuite/gfortran.dg/char_result_1.f90 
-flto -c
[Book15] f90/bug% lldb 
/opt/gcc/gcc7w/libexec/gcc/x86_64-apple-darwin16.1.0/7.0.0/lto1
(lldb) target create 
"/opt/gcc/gcc7w/libexec/gcc/x86_64-apple-darwin16.1.0/7.0.0/lto1"
Current executable set to 
'/opt/gcc/gcc7w/libexec/gcc/x86_64-apple-darwin16.1.0/7.0.0/lto1' (x86_64).
(lldb) run char_result_1.o
Process 1310 launched: 
'/opt/gcc/gcc7w/libexec/gcc/x86_64-apple-darwin16.1.0/7.0.0/lto1' (x86_64)
Reading object files: char_result_1.o {GC start 2195k} 
Reading the callgraph
Merging declarations
Reading summaries
Reading function bodies:
Performing interprocedural optimizations

 Assembling functions:
f2 f1 double test 
MAIN__Process 1310 stopped
* thread #1: tid = 0x54ac, 0x0001008f76cf 
lto1`lto_input_tree_ref(ib=, data_in=0x000142621090, 
fn=0x, tag=) + 127 at lto-streamer-in.c:332, queue 
= 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x18)
frame #0: 0x0001008f76cf lto1`lto_input_tree_ref(ib=, 
data_in=0x000142621090, fn=0x, tag=) + 127 at 
lto-streamer-in.c:332
   329  
   330  case LTO_ssa_name_ref:
   331ix_u = streamer_read_uhwi (ib);
-> 332result = (*SSANAMES (fn))[ix_u];
   333break;
   334  
   335  case LTO_field_decl_ref:
(lldb) bt
* thread #1: tid = 0x54ac, 0x0001008f76cf 
lto1`lto_input_tree_ref(ib=, data_in=0x000142621090, 
fn=0x, tag=) + 127 at lto-streamer-in.c:332, queue 
= 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x18)
  * frame #0: 0x0001008f76cf lto1`lto_input_tree_ref(ib=, 
data_in=0x000142621090, fn=0x, tag=) + 127 at 
lto-streamer-in.c:332
frame #1: 0x0001008f78f1 lto1`lto_input_tree_1(ib=0x7fff5fbfeeb0, 
data_in=0x000142621090, tag=, hash=) + 177 at 
lto-streamer-in.c:1446
frame #2: 0x0001008f7cb8 lto1`lto_input_tree(ib=0x7fff5fbfeeb0, 
data_in=0x000142621090) + 88 at lto-streamer-in.c:1492
frame #3: 0x000100ce8b43 lto1`streamer_read_tree_body(lto_input_block*, 
data_in*, tree_node*) + 51 at tree-streamer-in.c:893
frame #4: 0x000100ce8b10 
lto1`streamer_read_tree_body(ib=0x7fff5fbfeeb0, data_in=0x000142621090, 
expr=0x000143944af0) + 2096
frame #5: 0x0001008f7110 lto1`::lto_read_tree_1(ib=0x7fff5fbfeeb0, 
data_in=0x000142621090, expr=0x000143944af0) + 32 at 
lto-streamer-in.c:1333
frame #6: 0x0001008f78b9 lto1`lto_input_tree_1(lto_input_block*, 
data_in*, LTO_tags, unsigned int) + 38 at lto-streamer-in.c:1363
frame #7: 0x0001008f7893 lto1`lto_input_tree_1(lto_input_block*, 
data_in*, LTO_tags, unsigned int) + 20
frame #8: 0x0001008f787f lto1`lto_input_tree_1(ib=0x7fff5fbfeeb0, 
data_in=0x000142621090, tag=, hash=2871463685) + 63
frame #9: 0x0001008f7c03 lto1`lto_input_scc(ib=0x7fff5fbfeeb0, 
data_in=0x000142621090, len=0x7fff5fbfede8, 
entry_len=0x7fff5fbfedec) + 371 at lto-streamer-in.c:1387
frame #10: 0x0001008f7c91 lto1`lto_input_tree(ib=0x7fff5fbfeeb0, 
data_in=0x000142621090) + 49 at lto-streamer-in.c:1490
frame #11: 0x0001008f8d93 
lto1`::lto_read_body_or_constructor(file_data=0x000143721000, 
data=, node=, section_type=LTO_section_function_body) 
+ 931 at lto-streamer-in.c:1045
frame #12: 0x00010051a136 
lto1`cgraph_node::get_untransformed_body(this=0x00014391ee60) + 278 at 
cgraph.c:3581
frame #13: 0x000100526d9a 
lto1`cgraph_node::expand(this=0x00014391ee60) + 74 at cgraphunit.c:1971
frame #14: 0x000100527f00 
lto1`::output_in_order(no_reorder=) + 528 at cgraphunit.c:2244
frame #15: 0x000100528528 
lto1`symbol_table::compile(this=0x000143526100) + 952 at cgraphunit.c:2488
frame #16: 0x000100034c68 lto1`lto_main() + 7112 at lto.c:3330
frame #17: 0x000100a87f1a lto1`::compile_file() + 58 at toplev.c:463
frame #18

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-16 Thread Andre Vehreschild
Hi Janne, hi all,

as promised please find attached the change from int32 to size_t for the
caf-libraries. Because the caf-libraries do not require special notions
indicated by negative values I went for using size_t there. I assume this will
be easier to keep in sync for all caf-libraries, because the size_t is
available on all modern platforms. I also took the liberty to fix the
specifiers in trans-decl for the caf-function-declarations and update the
documentation on the caf-functions in gfortran.texi where some parameters where
missing.

These additional changes bootstrap fine and induce no new regressions on
x86_64-linux/f23.

I am still not sure, whether we shouldn't address the regression in
allocate_deferred_char_scalar_1. I did some research, but could not yet come to
a practical solution. The backend somehow deduces that the memory pointed to
has size 5 only. But I haven't found a way to do this in the front end.

Comments on these changes?

- Andre

On Tue, 13 Dec 2016 21:08:51 +0200
Janne Blomqvist  wrote:

> On Mon, Dec 12, 2016 at 9:39 PM, Janne Blomqvist
>  wrote:
> > On Mon, Dec 12, 2016 at 7:39 PM, Andre Vehreschild  wrote:  
> >> Hi Janne,
> >>
> >> I found that you are favoring build_int_cst (size_type_node, 0) over
> >> size_zero_node. Is there a reason to this?  
> >
> > Yes. AFAIU size_zero_node is a zero constant for sizetype which is not
> > the same as size_type_node.
> >
> > AFAIK the difference is that size_type_node is the C size_t type,
> > whereas sizetype is a GCC internal type used for address expressions.
> > On a "normal" target I understand that they are the same size, but
> > there are some slight differences in semantics, e.g. size_type_node
> > like C unsigned integer types is defined to wrap around on overflow
> > whereas sizetype is undefined on overflow.
> >
> > I don't know if GCC supports some strange targets with some kind of
> > segmented memory where the size of sizetype would be different from
> > size_type_node, but I guess it's possible in theory at least.
> >
> > That being said, now that you mention in I should be using
> > build_zero_cst (some_type_node) instead of
> > build_int_cst(some_type_node, 0). There's also build_one_cst that I
> > should use.
> >  
> >> Furthermore did I have to patch this:
> >>
> >> diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
> >> index 585f25d..f374558 100644
> >> --- a/gcc/fortran/dump-parse-tree.c
> >> +++ b/gcc/fortran/dump-parse-tree.c
> >> @@ -465,7 +465,7 @@ show_expr (gfc_expr *p)
> >>   break;
> >>
> >> case BT_HOLLERITH:
> >> - fprintf (dumpfile, "%dH", p->representation.length);
> >> + fprintf (dumpfile, "%zdH", p->representation.length);
> >>   c = p->representation.string;
> >>   for (i = 0; i < p->representation.length; i++, c++)
> >> {
> >>
> >> to bootstrap on x86_64-linux/f23.  
> >
> > Ah, thanks for the catch. I'll fix it by using HOST_WIDE_INT_PRINT_DEC
> > since I'll have to change gfc_charlen_t to be a typedef form
> > HOST_WIDE_INT (see my answer to FX).
> >  
> >> And I have this regression:
> >>
> >> FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03   -O1  (test for
> >> excess errors)
> >>
> >> allocate_deferred_char_scalar_1.f03:184:0:
> >>
> >>  p = '12345679'
> >>
> >> Warning: '__builtin_memcpy' writing 8 bytes into a region of size 5
> >> overflows the destination [-Wstringop-overflow=]
> >>  allocate_deferred_char_scalar_1.f03:242:0:
> >>
> >>  p = 4_'12345679'
> >>
> >> Warning: '__builtin_memcpy' writing 32 bytes into a region of size 20
> >> overflows the destination [-Wstringop-overflow=]  
> >
> > I'm seeing that too, but I assumed they would be fixed by Paul's
> > recent patch which I don't yet have in my tree yet due to the git
> > mirror being stuck..
> >  
> >> Btw, the patch for changing the ABI of the coarray-libs is already nearly
> >> done. I just need to figure that what the state of regressions is with and
> >> without my change.  
> >
> > Thanks.
> >
> > I'll produce an updated patch with the changes discussed so far.
> >
> >
> > --
> > Janne Blomqvist  
> 
> Hi,
> 
> attached is the updated patch that applies on top of the original. I
> didn't do the charlen_zero_node etc, I just fixed the relatively few
> places in my previous patch rather than everywhere in the entire
> frontend.
> 
> Now that the git mirror is working again, I see though those warnings
> with -O1 from gfortran.dg/allocate_deferred_char_scalar_1.f03 are
> still there, and Paul's patch didn't get rid of them. :(. I have
> looked at the tree dumps, however, and AFAICS it's a bogus warning.
> 
> Ok for trunk?
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
gcc/fortran/ChangeLog:

2016-12-16  Andre Vehreschild  

* gfortran.texi: Changed string length to size_t and added missing
documenation of parameters.
* trans-array.c (gfc_alloc_allocatable_for_assignment): Adapted to
  

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-13 Thread Janne Blomqvist
On Mon, Dec 12, 2016 at 9:39 PM, Janne Blomqvist
 wrote:
> On Mon, Dec 12, 2016 at 7:39 PM, Andre Vehreschild  wrote:
>> Hi Janne,
>>
>> I found that you are favoring build_int_cst (size_type_node, 0) over
>> size_zero_node. Is there a reason to this?
>
> Yes. AFAIU size_zero_node is a zero constant for sizetype which is not
> the same as size_type_node.
>
> AFAIK the difference is that size_type_node is the C size_t type,
> whereas sizetype is a GCC internal type used for address expressions.
> On a "normal" target I understand that they are the same size, but
> there are some slight differences in semantics, e.g. size_type_node
> like C unsigned integer types is defined to wrap around on overflow
> whereas sizetype is undefined on overflow.
>
> I don't know if GCC supports some strange targets with some kind of
> segmented memory where the size of sizetype would be different from
> size_type_node, but I guess it's possible in theory at least.
>
> That being said, now that you mention in I should be using
> build_zero_cst (some_type_node) instead of
> build_int_cst(some_type_node, 0). There's also build_one_cst that I
> should use.
>
>> Furthermore did I have to patch this:
>>
>> diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
>> index 585f25d..f374558 100644
>> --- a/gcc/fortran/dump-parse-tree.c
>> +++ b/gcc/fortran/dump-parse-tree.c
>> @@ -465,7 +465,7 @@ show_expr (gfc_expr *p)
>>   break;
>>
>> case BT_HOLLERITH:
>> - fprintf (dumpfile, "%dH", p->representation.length);
>> + fprintf (dumpfile, "%zdH", p->representation.length);
>>   c = p->representation.string;
>>   for (i = 0; i < p->representation.length; i++, c++)
>> {
>>
>> to bootstrap on x86_64-linux/f23.
>
> Ah, thanks for the catch. I'll fix it by using HOST_WIDE_INT_PRINT_DEC
> since I'll have to change gfc_charlen_t to be a typedef form
> HOST_WIDE_INT (see my answer to FX).
>
>> And I have this regression:
>>
>> FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03   -O1  (test for excess
>> errors)
>>
>> allocate_deferred_char_scalar_1.f03:184:0:
>>
>>  p = '12345679'
>>
>> Warning: '__builtin_memcpy' writing 8 bytes into a region of size 5 overflows
>>  the destination [-Wstringop-overflow=]
>>  allocate_deferred_char_scalar_1.f03:242:0:
>>
>>  p = 4_'12345679'
>>
>> Warning: '__builtin_memcpy' writing 32 bytes into a region of size 20 
>> overflows
>> the destination [-Wstringop-overflow=]
>
> I'm seeing that too, but I assumed they would be fixed by Paul's
> recent patch which I don't yet have in my tree yet due to the git
> mirror being stuck..
>
>> Btw, the patch for changing the ABI of the coarray-libs is already nearly 
>> done.
>> I just need to figure that what the state of regressions is with and without 
>> my
>> change.
>
> Thanks.
>
> I'll produce an updated patch with the changes discussed so far.
>
>
> --
> Janne Blomqvist

Hi,

attached is the updated patch that applies on top of the original. I
didn't do the charlen_zero_node etc, I just fixed the relatively few
places in my previous patch rather than everywhere in the entire
frontend.

Now that the git mirror is working again, I see though those warnings
with -O1 from gfortran.dg/allocate_deferred_char_scalar_1.f03 are
still there, and Paul's patch didn't get rid of them. :(. I have
looked at the tree dumps, however, and AFAICS it's a bogus warning.

Ok for trunk?

-- 
Janne Blomqvist
diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 585f25d..7aafe54 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -348,12 +348,10 @@ show_constructor (gfc_constructor_base base)
 
 
 static void
-show_char_const (const gfc_char_t *c, int length)
+show_char_const (const gfc_char_t *c, gfc_charlen_t length)
 {
-  int i;
-
   fputc ('\'', dumpfile);
-  for (i = 0; i < length; i++)
+  for (gfc_charlen_t i = 0; i < length; i++)
 {
   if (c[i] == '\'')
fputs ("''", dumpfile);
@@ -465,7 +463,8 @@ show_expr (gfc_expr *p)
  break;
 
case BT_HOLLERITH:
- fprintf (dumpfile, "%dH", p->representation.length);
+ fprintf (dumpfile, HOST_WIDE_INT_PRINT_DEC "H",
+  p->representation.length);
  c = p->representation.string;
  for (i = 0; i < p->representation.length; i++, c++)
{
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 6a05e9e..e82c320 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2067,10 +2067,12 @@ gfc_intrinsic_sym;
 typedef splay_tree gfc_constructor_base;
 
 
-/* This should be a size_t. But occasionally the string length field
-   is used as a flag with values -1 and -2, see
-   e.g. gfc_add_assign_aux_vars.  */
-typedef ptrdiff_t gfc_charlen_t;
+/* This should be an unsigned variable of type size_t.  But to handle
+   compiling to a 64-bit target from a 32-bit host, we need to use a
+   HOST_WIDE

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread Andre Vehreschild
Hi Janne,

How about adding charlen_zero_node and one_node like the others have it to 
prevent repeating ourselves?

- Andre

Am 12. Dezember 2016 20:39:38 MEZ, schrieb Janne Blomqvist 
:
>On Mon, Dec 12, 2016 at 7:39 PM, Andre Vehreschild 
>wrote:
>> Hi Janne,
>>
>> I found that you are favoring build_int_cst (size_type_node, 0) over
>> size_zero_node. Is there a reason to this?
>
>Yes. AFAIU size_zero_node is a zero constant for sizetype which is not
>the same as size_type_node.
>
>AFAIK the difference is that size_type_node is the C size_t type,
>whereas sizetype is a GCC internal type used for address expressions.
>On a "normal" target I understand that they are the same size, but
>there are some slight differences in semantics, e.g. size_type_node
>like C unsigned integer types is defined to wrap around on overflow
>whereas sizetype is undefined on overflow.
>
>I don't know if GCC supports some strange targets with some kind of
>segmented memory where the size of sizetype would be different from
>size_type_node, but I guess it's possible in theory at least.
>
>That being said, now that you mention in I should be using
>build_zero_cst (some_type_node) instead of
>build_int_cst(some_type_node, 0). There's also build_one_cst that I
>should use.
>
>> Furthermore did I have to patch this:
>>
>> diff --git a/gcc/fortran/dump-parse-tree.c
>b/gcc/fortran/dump-parse-tree.c
>> index 585f25d..f374558 100644
>> --- a/gcc/fortran/dump-parse-tree.c
>> +++ b/gcc/fortran/dump-parse-tree.c
>> @@ -465,7 +465,7 @@ show_expr (gfc_expr *p)
>>   break;
>>
>> case BT_HOLLERITH:
>> - fprintf (dumpfile, "%dH", p->representation.length);
>> + fprintf (dumpfile, "%zdH", p->representation.length);
>>   c = p->representation.string;
>>   for (i = 0; i < p->representation.length; i++, c++)
>> {
>>
>> to bootstrap on x86_64-linux/f23.
>
>Ah, thanks for the catch. I'll fix it by using HOST_WIDE_INT_PRINT_DEC
>since I'll have to change gfc_charlen_t to be a typedef form
>HOST_WIDE_INT (see my answer to FX).
>
>> And I have this regression:
>>
>> FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03   -O1  (test
>for excess
>> errors)
>>
>> allocate_deferred_char_scalar_1.f03:184:0:
>>
>>  p = '12345679'
>>
>> Warning: '__builtin_memcpy' writing 8 bytes into a region of size 5
>overflows
>>  the destination [-Wstringop-overflow=]
>>  allocate_deferred_char_scalar_1.f03:242:0:
>>
>>  p = 4_'12345679'
>>
>> Warning: '__builtin_memcpy' writing 32 bytes into a region of size 20
>overflows
>> the destination [-Wstringop-overflow=]
>
>I'm seeing that too, but I assumed they would be fixed by Paul's
>recent patch which I don't yet have in my tree yet due to the git
>mirror being stuck..
>
>> Btw, the patch for changing the ABI of the coarray-libs is already
>nearly done.
>> I just need to figure that what the state of regressions is with and
>without my
>> change.
>
>Thanks.
>
>I'll produce an updated patch with the changes discussed so far.

-- 
Andre Vehreschild * Kreuzherrenstr. 8 * 52062 Aachen
Tel.: +49 241 929 10 18 * ve...@gmx.de


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread Janne Blomqvist
On Mon, Dec 12, 2016 at 7:39 PM, Andre Vehreschild  wrote:
> Hi Janne,
>
> I found that you are favoring build_int_cst (size_type_node, 0) over
> size_zero_node. Is there a reason to this?

Yes. AFAIU size_zero_node is a zero constant for sizetype which is not
the same as size_type_node.

AFAIK the difference is that size_type_node is the C size_t type,
whereas sizetype is a GCC internal type used for address expressions.
On a "normal" target I understand that they are the same size, but
there are some slight differences in semantics, e.g. size_type_node
like C unsigned integer types is defined to wrap around on overflow
whereas sizetype is undefined on overflow.

I don't know if GCC supports some strange targets with some kind of
segmented memory where the size of sizetype would be different from
size_type_node, but I guess it's possible in theory at least.

That being said, now that you mention in I should be using
build_zero_cst (some_type_node) instead of
build_int_cst(some_type_node, 0). There's also build_one_cst that I
should use.

> Furthermore did I have to patch this:
>
> diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
> index 585f25d..f374558 100644
> --- a/gcc/fortran/dump-parse-tree.c
> +++ b/gcc/fortran/dump-parse-tree.c
> @@ -465,7 +465,7 @@ show_expr (gfc_expr *p)
>   break;
>
> case BT_HOLLERITH:
> - fprintf (dumpfile, "%dH", p->representation.length);
> + fprintf (dumpfile, "%zdH", p->representation.length);
>   c = p->representation.string;
>   for (i = 0; i < p->representation.length; i++, c++)
> {
>
> to bootstrap on x86_64-linux/f23.

Ah, thanks for the catch. I'll fix it by using HOST_WIDE_INT_PRINT_DEC
since I'll have to change gfc_charlen_t to be a typedef form
HOST_WIDE_INT (see my answer to FX).

> And I have this regression:
>
> FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03   -O1  (test for excess
> errors)
>
> allocate_deferred_char_scalar_1.f03:184:0:
>
>  p = '12345679'
>
> Warning: '__builtin_memcpy' writing 8 bytes into a region of size 5 overflows
>  the destination [-Wstringop-overflow=]
>  allocate_deferred_char_scalar_1.f03:242:0:
>
>  p = 4_'12345679'
>
> Warning: '__builtin_memcpy' writing 32 bytes into a region of size 20 
> overflows
> the destination [-Wstringop-overflow=]

I'm seeing that too, but I assumed they would be fixed by Paul's
recent patch which I don't yet have in my tree yet due to the git
mirror being stuck..

> Btw, the patch for changing the ABI of the coarray-libs is already nearly 
> done.
> I just need to figure that what the state of regressions is with and without 
> my
> change.

Thanks.

I'll produce an updated patch with the changes discussed so far.


-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread Bob Deen

> However, this will also affect people doing C->Fortran calls the
> old-fashioned way without ISO_C_BINDING, as they will have to change
> the string length argument from int to size_t in their prototypes.
> Then again, Intel Fortran did this some years ago so I guess at least
> people who care about portability to several compilers are aware.

We do a ton of this (old fashioned c-fortran binding) and changing the string 
length argument size will have a big impact on us.  We don't use the Intel 
compiler so we never noticed a change there.

Is there really a use case for strings > 2 GB that justifies the breakage?  I 
certainly understand wanting to do it "right" but I'm probably not the only one 
with practical considerations that argue against it if there are no compelling 
use cases.

Thanks...

-Bob

Bob Deen @ NASA-JPL Multimission Image Processing Lab
bob.d...@jpl.nasa.gov




Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread Andre Vehreschild
Hi Janne,

I found that you are favoring build_int_cst (size_type_node, 0) over
size_zero_node. Is there a reason to this?

Furthermore did I have to patch this:

diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 585f25d..f374558 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -465,7 +465,7 @@ show_expr (gfc_expr *p)
  break;
 
case BT_HOLLERITH:
- fprintf (dumpfile, "%dH", p->representation.length);
+ fprintf (dumpfile, "%zdH", p->representation.length);
  c = p->representation.string;
  for (i = 0; i < p->representation.length; i++, c++)
{

to bootstrap on x86_64-linux/f23.

And I have this regression:

FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03   -O1  (test for excess
errors)

allocate_deferred_char_scalar_1.f03:184:0:

 p = '12345679'
 
Warning: '__builtin_memcpy' writing 8 bytes into a region of size 5 overflows
 the destination [-Wstringop-overflow=]
 allocate_deferred_char_scalar_1.f03:242:0:

 p = 4_'12345679'
 
Warning: '__builtin_memcpy' writing 32 bytes into a region of size 20 overflows
the destination [-Wstringop-overflow=]

I also tried with a sanitized gfortran and am seeing some issues there. I have
to sort through these, but thought to let you know about the above already.

Btw, the patch for changing the ABI of the coarray-libs is already nearly done.
I just need to figure that what the state of regressions is with and without my
change.

- Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread Janne Blomqvist
On Mon, Dec 12, 2016 at 3:35 PM, Janus Weil  wrote:
> Hi guys,
>
>> there is already an ABI change. DTIO needed it.
>
> maybe it would be a good idea to document this in places like:
> * https://gcc.gnu.org/wiki/GFortran/News
> * https://gcc.gnu.org/gcc-7/changes.html
>
> On the first page there are "Compatibility notices" for several
> earlier versions which mention stuff like this ...

Yes, absolutely. I was planning to do it when/if the patch is accepted
and merged.

-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread Janus Weil
Hi guys,

> there is already an ABI change. DTIO needed it.

maybe it would be a good idea to document this in places like:
* https://gcc.gnu.org/wiki/GFortran/News
* https://gcc.gnu.org/gcc-7/changes.html

On the first page there are "Compatibility notices" for several
earlier versions which mention stuff like this ...

Cheers,
Janus



> On Mon, 12 Dec 2016 11:20:06 +0100
> FX  wrote:
>
>> Hi Janne,
>>
>> This is an ABI change, so it is serious… it will require people to recompile
>> older code and libraries with the new compiler. Do we already plan to break
>> the ABI in this cycle, or is this the first ABI-breaking patch of the cycle?
>> And do we have real-life examples of character strings larger than 2 GB?
>>
>> > Also, as there are some places in the frontend were negative character
>> > lengths are used as special flag values, in the frontend the character
>> > length is handled as a signed variable of the same size as a size_t,
>> > although in the runtime library it really is size_t.
>>
>> First, I thought: we should really make it size_t, and have the negative
>> values be well-defined constants, e.g. (size_t) -1
>>
>> On the other hand, there is the problem of the case where the front-end has
>> different size_t than the target: think 32-bit on 64-bit i386 (front-end
>> size_t larger than target size_t), or cross-compiling for 64-bit on a 32-bit
>> machine (front-end size_t smaller than target size_t). So the charlen type
>> bounds need to be determined when the front-end runs, not when it is compiled
>> (i.e. it is not a fixed type).
>>
>> In iresolve.c, the "Why is this fixup needed?” comment is kinda scary.
>>
>>
>> > I haven't changed the character length variables for the co-array
>> > intrinsics, as this is something that may need to be synchronized with
>> > OpenCoarrays.
>>
>> Won’t that mean that coarray programs will fail due to ABI mismatch?
>>
>>
>> FX
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread Janne Blomqvist
On Mon, Dec 12, 2016 at 12:29 PM, Andre Vehreschild  wrote:
> I will take on the coarray ABI changes in the next days and also emit a
> pull-request to the opencoarrays to get them to sync. Janne, please wait until
> I have added those changes to prevent people from having to re-compile 
> multiple
> times.

Ok, thanks for taking care of this!


-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread Janne Blomqvist
On Mon, Dec 12, 2016 at 12:20 PM, FX  wrote:
> Hi Janne,
>
> This is an ABI change, so it is serious… it will require people to recompile 
> older code and libraries with the new compiler. Do we already plan to break 
> the ABI in this cycle, or is this the first ABI-breaking patch of the cycle?

As Andre mentioned, the ABI has already been broken, Gfortran 7 will
have libgfortran.so.4.

However, this will also affect people doing C->Fortran calls the
old-fashioned way without ISO_C_BINDING, as they will have to change
the string length argument from int to size_t in their prototypes.
Then again, Intel Fortran did this some years ago so I guess at least
people who care about portability to several compilers are aware.

> And do we have real-life examples of character strings larger than 2 GB?

Well, people who have needed such will have figured out some
work-around since we haven't supported it, so how would we know? :) It
could be splitting the data into several strings, or switching to
ifort, using C instead of Fortran, or something else.

In any case, I don't expect characters larger than 2 GB to be common
(particularly with the Fortran standard-mandated behaviour of
space-padding to the end in many cases), but as the ABI has been
broken anyways, we might as well fix it.

IIRC at some point there was some discussion of this on
comp.lang.fortran, and somebody mentioned analysis of genomic data as
a use case where large characters can be useful. I don't have any
personal usecase though, at least at the moment.

>> Also, as there are some places in the frontend were negative character
>> lengths are used as special flag values, in the frontend the character
>> length is handled as a signed variable of the same size as a size_t,
>> although in the runtime library it really is size_t.
>
> First, I thought: we should really make it size_t, and have the negative 
> values be well-defined constants, e.g. (size_t) -1

I tried it, but in addition to the issue with negative characters used
as flag values, there's issues like we have stuff such as
gfc_get_int_expr() that take a kind value, and an integer constant,
and produces a gfc_expr. But that doesn't understand stuff like
unsigned types. So in the end I decided it's better to get this patch
in working shape and merged with the ABI changes, then one can fix the
unsigned-ness later (in the end it's just a factor of two in sizes we
can handle, so not a huge deal).

> On the other hand, there is the problem of the case where the front-end has 
> different size_t than the target: think 32-bit on 64-bit i386 (front-end 
> size_t larger than target size_t), or cross-compiling for 64-bit on a 32-bit 
> machine (front-end size_t smaller than target size_t). So the charlen type 
> bounds need to be determined when the front-end runs, not when it is compiled 
> (i.e. it is not a fixed type).

True. Although things like gfc_charlen_type_node should be correct for
the target, the type gfc_charlen_t that I introduced in the frontend
might be too small if one is doing a 32->64 bit cross-compile. So that
should be changed from a typedef of ptrdiff_t to a typedef of
HOST_WIDE_INT which AFAIK is guaranteed to be 64-bit everywhere.

> In iresolve.c, the "Why is this fixup needed?” comment is kinda scary.

Hmm, I think it's a leftover from some earlier experimentation, should
be removed.

>> I haven't changed the character length variables for the co-array
>> intrinsics, as this is something that may need to be synchronized with
>> OpenCoarrays.
>
> Won’t that mean that coarray programs will fail due to ABI mismatch?

No, the co-array intrinsics are, well, intrinsics, so they're handled
specially in the frontend and don't need to follow the normal
argument-passing conventions. But I think it'd be easier if they did,
and might prevent some obscure corner-case bugs. Say, create a
character variable with length 2**31+9, then typecasting to plain int
when calling the intrinsic would wrap around and the library would see
a negative length.



-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread Andre Vehreschild
Hi FX,

there is already an ABI change. DTIO needed it.

I will take on the coarray ABI changes in the next days and also emit a
pull-request to the opencoarrays to get them to sync. Janne, please wait until
I have added those changes to prevent people from having to re-compile multiple
times.

- Andre

On Mon, 12 Dec 2016 11:20:06 +0100
FX  wrote:

> Hi Janne,
> 
> This is an ABI change, so it is serious… it will require people to recompile
> older code and libraries with the new compiler. Do we already plan to break
> the ABI in this cycle, or is this the first ABI-breaking patch of the cycle?
> And do we have real-life examples of character strings larger than 2 GB?
> 
> > Also, as there are some places in the frontend were negative character
> > lengths are used as special flag values, in the frontend the character
> > length is handled as a signed variable of the same size as a size_t,
> > although in the runtime library it really is size_t.  
> 
> First, I thought: we should really make it size_t, and have the negative
> values be well-defined constants, e.g. (size_t) -1
> 
> On the other hand, there is the problem of the case where the front-end has
> different size_t than the target: think 32-bit on 64-bit i386 (front-end
> size_t larger than target size_t), or cross-compiling for 64-bit on a 32-bit
> machine (front-end size_t smaller than target size_t). So the charlen type
> bounds need to be determined when the front-end runs, not when it is compiled
> (i.e. it is not a fixed type).
> 
> In iresolve.c, the "Why is this fixup needed?” comment is kinda scary.
> 
> 
> > I haven't changed the character length variables for the co-array
> > intrinsics, as this is something that may need to be synchronized with
> > OpenCoarrays.  
> 
> Won’t that mean that coarray programs will fail due to ABI mismatch?
> 
> 
> FX

-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread FX
Hi Janne,

This is an ABI change, so it is serious… it will require people to recompile 
older code and libraries with the new compiler. Do we already plan to break the 
ABI in this cycle, or is this the first ABI-breaking patch of the cycle? And do 
we have real-life examples of character strings larger than 2 GB?

> Also, as there are some places in the frontend were negative character
> lengths are used as special flag values, in the frontend the character
> length is handled as a signed variable of the same size as a size_t,
> although in the runtime library it really is size_t.

First, I thought: we should really make it size_t, and have the negative values 
be well-defined constants, e.g. (size_t) -1

On the other hand, there is the problem of the case where the front-end has 
different size_t than the target: think 32-bit on 64-bit i386 (front-end size_t 
larger than target size_t), or cross-compiling for 64-bit on a 32-bit machine 
(front-end size_t smaller than target size_t). So the charlen type bounds need 
to be determined when the front-end runs, not when it is compiled (i.e. it is 
not a fixed type).

In iresolve.c, the "Why is this fixup needed?” comment is kinda scary.


> I haven't changed the character length variables for the co-array
> intrinsics, as this is something that may need to be synchronized with
> OpenCoarrays.

Won’t that mean that coarray programs will fail due to ABI mismatch?


FX

[PATCH] PR 78534 Change character length from int to size_t

2016-12-12 Thread Janne Blomqvist
In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.

This is an ABI change, as procedures with character arguments take
hidden arguments with the character length.

I also changed the _size member in vtables from int to size_t, as
there were some cases where character lengths and sizes were
apparently mixed up and caused regressions otherwise. Although I
haven't tested, this might enable very large derived types as well.

Also, as there are some places in the frontend were negative character
lengths are used as special flag values, in the frontend the character
length is handled as a signed variable of the same size as a size_t,
although in the runtime library it really is size_t.

I haven't changed the character length variables for the co-array
intrinsics, as this is something that may need to be synchronized with
OpenCoarrays.

A caveat here is the testcase char_result_8.f90, which I changed
slightly to work around PR 78757. It's a somewhat obscure corner case,
and while this patch admittedly makes it more likely to trigger this
bug, it's, well, still a corner case.

Regtested on x86_64-pc-linux-gnu, Ok for trunk?

frontend:

2016-12-12  Janne Blomqvist  

PR fortran/78534
PR fortran/66310
* arith.c (gfc_check_charlen_range): New function.
(gfc_range_check): Use gfc_check_charlen_range.
* class.c (gfc_find_derived_vtab): Use gfc_size_kind instead of
hardcoded kind.
(find_intrinsic_vtab): Likewise.
* expr.c (gfc_get_character_expr): Length parameter of type
gfc_charlen_t.
(gfc_get_int_expr): Value argument of type long.
(gfc_extract_long): New function.
* gfortran.h (gfc_typespec): New member is_charlen.
(gfc_charlen_t): New typedef.
(gfc_expr): Use gfc_charlen_t for character lengths.
(gfc_size_kind): New extern variable.
(gfc_extract_long): New prototype.
(gfc_get_character_expr): Use gfc_charlen_t for character length.
(gfc_get_int_expr): Use long type for value argument.
* iresolve.c (gfc_resolve_repeat): Use gfc_charlen_t,
gfc_charlen_int_kind, set is_charlen.
* match.c (select_intrinsic_set_tmp): Use long for charlen.
* module.c (atom_int): Change type from int to HOST_WIDE_INT.
(parse_integer): Don't complain about large integers.
(write_atom): Use HOST_WIDE_INT for integers.
(mio_integer): Handle integer type mismatch.
(mio_hwi): New function.
(mio_intrinsic_op): Use HOST_WIDE_INT.
(mio_array_ref): Likewise.
(mio_expr): Likewise.
* resolve.c (resolve_substring): Use get_type_static_bounds.
(resolve_select_type): Use long for charlen.
(resolve_charlen): Use long for charlen, get_type_static_bounds.
* simplify.c (gfc_simplify_repeat): Likewise.
* target-memory.c (gfc_interpret_character): Use gfc_charlen_t.
* trans-array.c (get_array_ctor_var_strlen): Use
gfc_conv_mpz_to_tree_type.
* trans-const.c (gfc_conv_mpz_to_tree_type): New function.
* trans-const.h (gfc_conv_mpz_to_tree_type): New prototype.
* trans-decl.c (create_function_arglist): Assert that length is
not NULL_TREE.
* trans-expr.c (gfc_class_len_or_zero_get): Build const of type
gfc_charlen_type_node.
(gfc_conv_intrinsic_to_class): Use gfc_charlen_int_kind instead of
4, fold_convert to correct type.
(gfc_conv_class_to_class): Build const of type size_type_node for
size.
(gfc_copy_class_to_class): Likewise.
(gfc_conv_string_length): Use same type in expression.
(gfc_conv_substring): Likewise, use long for charlen.
(gfc_conv_string_tmp): Make sure len is of the right type.
(gfc_conv_concat_op): Use same type in expression.
(gfc_conv_procedure_call): Likewise.
(alloc_scalar_allocatable_for_subcomponent_assignment):
fold_convert to right type.
(gfc_trans_subcomponent_assign): Likewise.
(trans_class_vptr_len_assignment): Build const of correct type.
(gfc_trans_pointer_assignment): Likewise.
(alloc_scalar_allocatable_for_assignment): fold_convert to right
type in expr.
(trans_class_assignment): Build const of correct type.
* trans-intrinsic.c (gfc_conv_associated): Likewise.
(gfc_conv_intrinsic_repeat): Do calculation in sizetype.
* trans-io.c (gfc_build_io_library_fndecls): Use
gfc_charlen_type_node for character lengths.
* trans-stmt.c (gfc_trans_label_assign): Build const of
gfc_charlen_type_node.
(gfc_trans_character_select): Likewise.
(gfc_trans_allocate): Likewise, don't typecast strlen result.
(gfc_trans_deallocate): Don't typecast strlen result.
* trans-types.c (gfc_size_kind): New variable.
(gfc_init_types)