Hi Olivier, On 08/06/18 20:01, Olivier Hainque wrote: > Hi Bernd, > > Things are moving fast so I'll answer your > messages incrementally :-) > >> On 3 Aug 2018, at 20:37, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote: > [...] >> Oh, how interesting. >> >> My patch only intends to add a dummy byte that is stripped again >> in the assembly. The purpose is to make it trivial to find out if >> a string constant is NUL-terminated in the middle-end by comparing >> TYPE_SIZE_UNIT (TREE_TYPE (decl)) with TREE_STRING_LENGTH (decl). > > Ah, Ok. I hadn't quite realized that yet. > >> TYPE_SIZE_UNIT >= TREE_STRING_LENGTH means zero terminated >> TYPE_SIZE_UNIT < TREE_STRING_LENGTH means not necessarily zero terminated, > > That is opposite to the situation we were having > for Ada before the patch series: > >> Such strings shall never chop off more than one nul character. >> Ada strings are generally not zero terminated, right? > > Right. Strings have explicit bounds in Ada, and a NUL character > is like any other. It can appear anywhere. For example, the > excerpt below constructs a string of 5 characters, bounds 1 to 5, > with a nul character in second position: > > X : String (1 .. 5) := "1" & Ascii.Nul & "345"; > > ('&' is the catenation operator) > > The problem we had was that string literals not explicitly > NUL terminated (as in X : String := "123" & Ascii.NUL) would > never go in mergeable sections. For example: > > Ada.Text_IO.Put_Line ("Hello World!"); > > Ada has so called "generic" units that are sometimes > instanciated many many times so there is real interest > in improving the situation there. > >> So in your search loop you would not have to look after >> type_len, because that byte would be guaranteed to be zero. >> >> That is if we agree that we want to restrict the string constants >> in that way as I proposed. >> >> In the case str_len > type_len I do not understand if that is right. > > In your new model, I don't know what sense it would make, indeed. > > In the previous model, it was clearly expected to be a possibility. > >> Because varasm will only output type_size bytes, >> this seems only to select a string section >> but the last nul byte will not be assembled. >> Something that is not contained in the type_size >> should not be assembled. >> >> What exactly do you want to accomplish? > > Based on the previous (say, gcc-7 or gcc-8) code base, arrange > for most Ada string literals to end up in a mergeable section > on ELF targets. > > In at least gcc-7 and gcc-8, our gigi adjustment + the > patch I posted achieve this effect. > > For example, for the following source: > > procedure Hello is > procedure Process (S : String) with Import, Convention => C; > begin > Process ("1234"); > end; >
Yes, thanks for this example, $ cat hello.adb procedure Hello is procedure puts (S : String) with Import, Convention => C; X : String(1..8) := "abcdefg" & Ascii.Nul; begin puts ("1234" & Ascii.Nul ); puts (X); end; produces again identical output with the latest patch that I posted right now. > Without the gigi change, I get (x86_64-linux, > -O2 -fmerge-all-constants): > > .section .rodata > .LC0: > .ascii "1234" > > Regular section, no terminating zero. > > The gigi change alone gets us the terminating nul (*), > but not yet a mergeable section, because the extra nul > isn't covered by the type bounds: > > .section .rodata > .LC0: > .string "1234" > > With the varasm change on top, checking beyond the > type bounds when the string constant isn't an initializer, > we get: > > .section .rodata.str1.1,"aMS",@progbits,1 > .LC0: > .string "1234" > > (*) The terminating zero, part of the string value, > not spanned by the type bounds, gets assembled through: > > assemble_constant_contents (...) > ... > size = get_constant_size (exp); > > then: > > get_constant_size (tree exp) > > size = int_size_in_bytes (TREE_TYPE (exp)); > if (TREE_CODE (exp) == STRING_CST) > size = MAX (TREE_STRING_LENGTH (exp), size); > > Again, this is just providing more context on the > original patch I posted, based on your questions. > Ah, good point. That needs to be changed... Actually I would like to have get_constant_size return just the type size. > I understand there are proposals to change things > pretty significantly in this area, so ... > > now on to your next messages and ideas :-) > > When I try this example: $ cat array9.adb -- { dg-do run } procedure Array9 is V1 : String(1..10) := "1234567890"; V2 : String(1..-1) := ""; procedure Compare (S : String) is begin if S'Size /= 8*S'Length then raise Program_Error; end if; end; begin Compare (""); Compare ("1234"); Compare (V1); Compare (V2); end; I see that "1234" is put in the merge section, but V1 is not. Maybe because of the alignment requirement? But it sould not be much different from the C test case, which is now able to merge the non-zero terminated strings. Thanks Bernd.