[Bug fortran/66499] Letters with accents change format behavior for X and T descriptors.

jvdelisle at gcc dot gnu.org via Gcc-bugs Sat, 24 Feb 2024 10:26:35 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66499


--- Comment #7 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
There are two issues going on here. We do not interpret source code that is
UTF-8 encoded.  This is why in our current tests for UTF-8 encoding of data
files we use hexadecimal codes.

I will have to see what the standard says about non-ASCII character sets in
source code.

If I get around this by using something like this:

char1 = 4_"Test without local char"
char2 = 4_"Test with local char "

char2(22:22) = 4_"Ã"
char2(23:23) = 4_"Ã"

$ ./a.out 
          23
          23
1234567890123456789012345678901234567890
  Test without local char              10.0000
  Test with local char ÃÃ            10.0000

The string lengths now match correctly.  One can see the tabbing is still off.
This is because the format buffer seek functions are byte oriented, and when
using UTF-8 encoding we need to seek the buffer differently. In fact we have to
allocate it differently as well to hold the four-byte characters.
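
To illustrate the byte versus character distinction, here is a minimal C
sketch. It is not the libgfortran code; the function names utf8_char_count and
utf8_byte_offset are hypothetical and only show the kind of character-aware
counting a T or X seek would need on a UTF-8 buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libgfortran.
   Count UTF-8 characters (code points) in the first nbytes of s.
   Continuation bytes have the bit pattern 10xxxxxx and are not counted.  */
size_t
utf8_char_count (const char *s, size_t nbytes)
{
  size_t count = 0;
  assert (s != NULL);
  for (size_t i = 0; i < nbytes; i++)
    if (((unsigned char) s[i] & 0xC0) != 0x80)
      count++;
  return count;
}

/* Hypothetical helper, not part of libgfortran.
   Return the byte offset at which character number nchars (0-based)
   starts, or nbytes if the buffer holds fewer characters.  A byte
   oriented seek would simply return nchars, which is only right for
   pure ASCII data.  */
size_t
utf8_byte_offset (const char *s, size_t nbytes, size_t nchars)
{
  size_t i = 0;
  assert (s != NULL);
  while (i < nbytes && nchars > 0)
    {
      i++;  /* Step over the lead byte.  */
      while (i < nbytes && ((unsigned char) s[i] & 0xC0) == 0x80)
        i++;  /* Step over its continuation bytes.  */
      nchars--;
    }
  return i;
}
```

For a buffer holding "ab", one two-byte accented character, and "cd", the byte
length is 6 but utf8_char_count reports 5, and the character in column 4 starts
at byte offset 4 rather than 3. That one-byte drift per accented character is
the same kind of shift seen in the output above.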
