On Thu, Oct 19, 2017 at 7:13 PM, Martin Sebor <mse...@gmail.com> wrote:
> On 10/19/2017 09:50 AM, Andreas Krebbel wrote:
>>
>> The TPF operating system uses the GCC S/390 backend.  They set an
>> EBCDIC exec charset for compilation using -fexec-charset.  However,
>> certain libraries require ASCII strings instead.  In order to be able
>> to put calls to that library into the normal code it is required to
>> switch the exec charset within a compilation unit.
>>
>> This is an attempt to implement it by adding a new pragma which could
>> be used like in the following example:
>>
>> int
>> foo ()
>> {
>>   call_with_utf8("hello world");
>>
>> #pragma GCC exec_charset("UTF16")
>>   call_with_utf16("hello world");
>>
>> #pragma GCC exec_charset(pop)
>>   call_with_utf8("hello world");
>> }
>>
>> Does this look reasonable?
>
>
> I'm not an expert on this but at a high level it looks reasonable
> to me.  But based on some small amount of work I did in this area
> I have a couple of questions.
>
> There are a few places in the compiler that already do or that
> should but don't yet handle different execution character sets.
> The former include built-ins like __builtin_isdigit() and
> __builtin_sprintf (in both builtins.c and gimple-ssa-sprintf.c).
> The latter is the -Wformat checking done by the C and C++ front
> ends.  The missing support for the latter is the subject of bug
> 38308.  According to bug 81686, LTO is apparently also missing
> support for exec-charset.
>
> I'm curious how the pragma might interact with these two areas,
> and whether the lack of support for it in the latter is a concern
> (and if not, why not).  For the former, I'm also wondering about
> the interaction of inlining and other interprocedural optimizations
> with the pragma.  Does it propagate through inlined calls as one
> would expect?

How does it work semantically to have different exec charsets?  That is,
if "strings" flow from a region with one -fexec-charset setting to a region
with another one is that undefined behavior?  Do we now require
external function declarations to be in the proper region (declared under
the appropriate exec charset flag)?  This would mean that passing
the exec charset in effect as additional argument isn't a possibility.

Or do we have to treat -fexec-charset similar to -frounding-math, that is,
we can't ever _interpret_ any string in the compiler?  [unless -fexec-charset
is the same everywhere]
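
To illustrate why interpreting strings in the compiler is a problem: the
character '0' is byte 0x30 in ASCII but 0xF0 in EBCDIC (code page 037), so
folding something like isdigit() on a literal only gives the right answer
if the charset that literal was encoded in is known.  A minimal sketch in
plain C; the enum and the helper name is_digit_in are hypothetical and not
GCC internals:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tag for the charset a string literal was encoded in.  */
enum exec_charset { CS_ASCII, CS_EBCDIC_037 };

/* Classify one byte as a decimal digit under the given charset.
   ASCII digits are 0x30..0x39; EBCDIC (code page 037) digits are
   0xF0..0xF9.  */
static bool
is_digit_in (enum exec_charset cs, unsigned char byte)
{
  switch (cs)
    {
    case CS_ASCII:
      return byte >= 0x30 && byte <= 0x39;
    case CS_EBCDIC_037:
      return byte >= 0xF0 && byte <= 0xF9;
    }
  /* Only reached if CS is not a valid enumerator.  */
  assert (!"unknown exec_charset");
  return false;
}
```

The same byte 0x30 is a digit under CS_ASCII and not under CS_EBCDIC_037,
which is the kind of mismatch that shows up once strings cross regions.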

I think the -frounding-math route is probably the easiest, and also the
wisest given the quite low test coverage we'll get.  Thus, add a
-fmixed-charset flag and reject any exec-charset attribute/pragma if that
flag is not set?  With LTO we could always add this and/or merge
-fexec-charset flags appropriately, injecting -fmixed-charset in case TUs
use different settings.
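
The LTO merging rule could look roughly like the following.  This is only
a sketch under assumptions: struct charset_merge and merge_exec_charsets
are hypothetical names, not existing LTO option-merging code, and charset
names are compared with a plain strcmp, so aliases such as "UTF-8" and
"utf8" would count as different:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of merging per-TU -fexec-charset settings.  */
struct charset_merge
{
  const char *charset;  /* common charset, or NULL if none/disagreeing */
  bool mixed;           /* true: act as if -fmixed-charset were given */
};

/* Keep the charset if every TU agrees; otherwise drop it and mark the
   link as mixed.  */
static struct charset_merge
merge_exec_charsets (const char *const *tu_charsets, size_t n)
{
  assert (tu_charsets != NULL || n == 0);

  struct charset_merge result = { n ? tu_charsets[0] : NULL, false };
  for (size_t i = 1; i < n; i++)
    if (strcmp (tu_charsets[i], result.charset) != 0)
      {
        result.charset = NULL;
        result.mixed = true;
        break;
      }
  return result;
}
```

With two TUs both built with the same charset the result keeps that
charset; as soon as one TU differs, the result carries no charset and has
mixed set.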

Richard.


> Thanks
> Martin
>
