Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

2020-01-28 Thread Andrew Benson
Hi Tobias,

Thanks - committed as 

https://gcc.gnu.org/git/gitweb.cgi?
p=gcc.git;a=commit;h=ad690d79cfbb905c5546c9333c5fd089d906505b

I'll send a separate email about changes to the release notes.

-Andrew

On Tuesday, January 28, 2020 6:49:54 PM PST Tobias Burnus wrote:
> Hi Andrew,
> 
> On 1/28/20 5:41 PM, Andrew Benson wrote:
> >> Can you also update the comment before the #define? It currently has:
> > Thanks for the suggestion. An updated patch is attached.
> 
> Thanks. LGTM now.
> 
> >> PS: I wonder whether there are relevant systems which will fail because
> >> they do not handle that long symbol names...
> > 
> > Good question. Should I hold off on committing the patch until this can be
> > tested further? Or should I just go ahead and commit and deal with any
> > such
> > problems if they show up?
> 
> I think it is fine – as a user, one can always choose a shorter name if
> the system one is interested in doesn't support it. Besides, following
> the current scheme, we do not really have a choice without breaking the ABI.
> > Also, Richard Biener raised the question (in the PR) of whether this patch
> > would be an ABI change. I can see that it probably would be, but don't
> > know
> > for sure. If it would change the ABI is there anything else that I need to
> > include in the patch?
> 
> As just mentioned on the PR, I think it is a small ABI change – before a
> very long identified got cut off now it is available in full extent. — I
> do not see any simple way to avoid this ABI change and, frankly, I also
> do not think any real-world program uses that long identifiers.
> 
> Namely, instead of using 3 times the length of 42 character
> ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnop") the standard permits 3
> times the length of 63 characters
> (ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890_).
> Already 42 is quite long for a module, submodule or procedure name. And
> that does work even without the patch. And as only the sum counts, if
> one part only uses 10 characters (e.g. "my_module5"), now 2*58
> characters available two others.
> 
> If at the end it really causes problems (source code not available), the
> user still can modify the module file and change the procedure name to
> the trimmed value and continue.
> 
> Thus, I do not think it is a real problem in practice. We could be nice,
> however, and add a note to the release notes (i.e.
> https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether
> it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git
> about how to change WWW files and CC Gerald as web maintainer when
> submitting wwwdocs patches.]
> 
> Thanks,
> 
> Tobias
> 
> > patch.diff
> > 
> > diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
> > index 52bc045..69171f3 100644
> > --- a/gcc/fortran/trans.h
> > +++ b/gcc/fortran/trans.h
> > @@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
> > 
> >   #include "predict.h"  /* For enum br_predictor and PRED_*.  */
> > 
> > -/* Mangled symbols take the form __module__name.  */
> > -#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
> > +/* Mangled symbols take the form __module__name or
> > __module.submodule__name.  */ +#define GFC_MAX_MANGLED_SYMBOL_LEN 
> > (GFC_MAX_SYMBOL_LEN*3+5)
> > 
> >   /* Struct for holding a block of statements.  It should be treated as an
> >   
> >  opaque entity and not modified directly.  This allows us to change
> >  the
> > 
> > diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90
> > b/gcc/testsuite/gfortran.dg/pr93461.f90 new file mode 100644
> > index 000..3bef326
> > --- /dev/null
> > +++ b/gcc/testsuite/gfortran.dg/pr93461.f90
> > @@ -0,0 +1,22 @@
> > +! { dg-do compile }
> > +! PR fortran/93461
> > +module aModuleWithAnAllowedName
> > +  interface
> > + module subroutine aShortName()
> > + end subroutine aShortName
> > +  end interface
> > +end module aModuleWithAnAllowedName
> > +
> > +submodule (aModuleWithAnAllowedName)
> > aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName +contains
> > +  subroutine aShortName()
> > +call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
> > +call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
> > +  end subroutine aShortName
> > +
> > +  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
> > +  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem
> > +
> > +  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
> > +  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso
> > +end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
> > 
> > ChangeLog
> > 
> > 2020-01-28  Andrew Benson
> > 
> > PR fortran/93461
> > * trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to
> > GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name,
> > plus the "." between module and submodule names.
> > 
> > * gfortran.dg/pr93461.f90: New test.


-- 

* Andrew Benson: http://

Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

2020-01-28 Thread Tobias Burnus

Hi Andrew,

On 1/28/20 5:41 PM, Andrew Benson wrote:

Can you also update the comment before the #define? It currently has:

Thanks for the suggestion. An updated patch is attached.

Thanks. LGTM now.

PS: I wonder whether there are relevant systems which will fail because they
do not handle that long symbol names...

Good question. Should I hold off on committing the patch until this can be
tested further? Or should I just go ahead and commit and deal with any such
problems if they show up?


I think it is fine – as a user, one can always choose a shorter name if 
the system one is interested in doesn't support it. Besides, following 
the current scheme, we do not really have a choice without breaking the ABI.



Also, Richard Biener raised the question (in the PR) of whether this patch
would be an ABI change. I can see that it probably would be, but don't know
for sure. If it would change the ABI is there anything else that I need to
include in the patch?


As just mentioned on the PR, I think it is a small ABI change – before a 
very long identified got cut off now it is available in full extent. — I 
do not see any simple way to avoid this ABI change and, frankly, I also 
do not think any real-world program uses that long identifiers.


Namely, instead of using 3 times the length of 42 character 
("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnop") the standard permits 3 
times the length of 63 characters 
(ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890_). 
Already 42 is quite long for a module, submodule or procedure name. And 
that does work even without the patch. And as only the sum counts, if 
one part only uses 10 characters (e.g. "my_module5"), now 2*58 
characters available two others.


If at the end it really causes problems (source code not available), the 
user still can modify the module file and change the procedure name to 
the trimmed value and continue.


Thus, I do not think it is a real problem in practice. We could be nice, 
however, and add a note to the release notes (i.e. 
https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether 
it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git 
about how to change WWW files and CC Gerald as web maintainer when 
submitting wwwdocs patches.]


Thanks,

Tobias


patch.diff

diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index 52bc045..69171f3 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
  
  #include "predict.h"  /* For enum br_predictor and PRED_*.  */
  
-/* Mangled symbols take the form __module__name.  */

-#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
+/* Mangled symbols take the form __module__name or __module.submodule__name.  
*/
+#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)
  
  /* Struct for holding a block of statements.  It should be treated as an

 opaque entity and not modified directly.  This allows us to change the
diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90 
b/gcc/testsuite/gfortran.dg/pr93461.f90
new file mode 100644
index 000..3bef326
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93461.f90
@@ -0,0 +1,22 @@
+! { dg-do compile }
+! PR fortran/93461
+module aModuleWithAnAllowedName
+  interface
+ module subroutine aShortName()
+ end subroutine aShortName
+  end interface
+end module aModuleWithAnAllowedName
+
+submodule (aModuleWithAnAllowedName) 
aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
+contains
+  subroutine aShortName()
+call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aShortName
+
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem
+
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso
+end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName

ChangeLog

2020-01-28  Andrew Benson

PR fortran/93461
* trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to
GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name,
plus the "." between module and submodule names.

* gfortran.dg/pr93461.f90: New test.


Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

2020-01-28 Thread Andrew Benson
Hi Tobias,

> > The problem occurs because GFC_MAX_MANGLED_SYMBOL_LEN is set to
> > GFC_MAX_SYMBOL_LEN*2+4, which is sufficient for a module name plus
> > function name (plus the additional "_"'s that get prepended), but
> > insufficient if a submodule name is included. The name then gets
> > truncated and can lead to two different functions having the same
> > (truncated) symbol name.
> > 
> > The fix is to increase this length to GFC_MAX_SYMBOL_LEN*3+5 - which
> > allows for the submodule name plus the "." added between module and
> > submodule names.
> > 
> > I've attached a patch for this which includes a new test case for this PR.
> > The patch regression tests cleanly.
> > 
> > OK to commit?
> 
> Can you also update the comment before the #define? It currently has:
> 
>   /* Mangled symbols take the form __module__name.  */
> -#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
> +#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)

Thanks for the suggestion. An updated patch is attached.

> PS: I wonder whether there are relevant systems which will fail because they
> do not handle that long symbol names...

Good question. Should I hold off on committing the patch until this can be 
tested further? Or should I just go ahead and commit and deal with any such 
problems if they show up?

Also, Richard Biener raised the question (in the PR) of whether this patch 
would be an ABI change. I can see that it probably would be, but don't know 
for sure. If it would change the ABI is there anything else that I need to 
include in the patch?

Thanks,
Andrew

-- 

* Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html

* Galacticus: https://github.com/galacticusorg/galacticus
diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index 52bc045..69171f3 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "predict.h"  /* For enum br_predictor and PRED_*.  */
 
-/* Mangled symbols take the form __module__name.  */
-#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
+/* Mangled symbols take the form __module__name or __module.submodule__name.  */
+#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)
 
 /* Struct for holding a block of statements.  It should be treated as an
opaque entity and not modified directly.  This allows us to change the
diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90 b/gcc/testsuite/gfortran.dg/pr93461.f90
new file mode 100644
index 000..3bef326
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93461.f90
@@ -0,0 +1,22 @@
+! { dg-do compile }
+! PR fortran/93461
+module aModuleWithAnAllowedName
+  interface
+ module subroutine aShortName()
+ end subroutine aShortName
+  end interface
+end module aModuleWithAnAllowedName
+
+submodule (aModuleWithAnAllowedName) aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
+contains
+  subroutine aShortName()
+call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aShortName
+  
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem
+
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso  
+end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
2020-01-28  Andrew Benson  

	PR fortran/93461
	* trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to
	GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name,
	plus the "." between module and submodule names.

	* gfortran.dg/pr93461.f90: New test.


Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

2020-01-28 Thread Tobias Burnus

Hi Andrew,

On 1/27/20 9:57 PM, Andrew Benson wrote:

The problem occurs because GFC_MAX_MANGLED_SYMBOL_LEN is set to
GFC_MAX_SYMBOL_LEN*2+4, which is sufficient for a module name plus function name
(plus the additional "_"'s that get prepended), but insufficient if a submodule
name is included. The name then gets truncated and can lead to two different
functions having the same (truncated) symbol name.

The fix is to increase this length to GFC_MAX_SYMBOL_LEN*3+5 - which allows for
the submodule name plus the "." added between module and submodule names.

I've attached a patch for this which includes a new test case for this PR. The
patch regression tests cleanly.

OK to commit?


Can you also update the comment before the #define? It currently has:

 /* Mangled symbols take the form __module__name.  */
-#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
+#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)

Otherweise, it LGTM.

Tobias

PS: I wonder whether there are relevant systems which will fail because they
do not handle that long symbol names...