> > I think keeping it untranslated is fine for now. Any translation
> > if really needed would be a separate feature.
> 
> I mean, unless you make extra effort, it is translated.
> Even in your current version, try constexpr const char *foo () { return "nop"; }
> and you'll see that it actually results in "\x95\x96\x97" with
> -fexec-charset=EBCDICUS.

Perhaps I don't understand the use case for this option.
Would it ever be used on a non-EBCDIC system?

> What is worse, constexpr const char *bar () { return "%0 %1"; }
> results in "\x6c\xf0\x40\x6c\xf1", so the compiler will not be able to find
> the % special characters in there etc.
> The parsing of the string literal in asm definitions uses translate=false
> to avoid the translations.
> As the static_assert paper says, for static_assert it isn't that big a deal,
> the program is already UB if it diagnoses static assertion failure, worst
> case it prints garbage if one plays with -fexec-charset=.  But for inline
> asm it would fail to compile...
> 
> So, the extension really should be well defined vs. the character set,
> either it should be characters in the execution charset and the FE would
> need to ask libcpp to translate it back, or it would need to be declared
> to be e.g. in UTF-8 regardless of the charset (like u8'x' or u8"abc"
> literals are; but then shouldn't the _M_data in that case return a pointer
> to char8_t instead), something else?

Okay, then we can always translate to UTF-8.
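
To make the problem concrete, here is a minimal sketch of what the
translation does to the two strings quoted above. It is not compiler code:
to_ebcdic_us, translate and count_percent are hypothetical helpers, the
table covers only the characters whose EBCDIC-US bytes appear in this
thread, and it assumes the host itself uses an ASCII-compatible charset.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: maps only the characters mentioned in this thread
// to the EBCDIC-US bytes quoted above. Not a complete table; any other
// character is passed through unchanged.
constexpr unsigned char to_ebcdic_us(char c) {
    switch (c) {
    case 'n': return 0x95;
    case 'o': return 0x96;
    case 'p': return 0x97;
    case '%': return 0x6c;
    case '0': return 0xf0;
    case '1': return 0xf1;
    case ' ': return 0x40;
    default:  return static_cast<unsigned char>(c);
    }
}

// Translate a whole string byte by byte, mimicking what happens to a
// string literal under -fexec-charset=EBCDICUS.
inline std::string translate(const std::string &s) {
    std::string out;
    out.reserve(s.size());
    for (char c : s)
        out.push_back(static_cast<char>(to_ebcdic_us(c)));
    return out;
}

// Stand-in for an asm template parser looking for '%' operand markers.
// Assumes the host charset is ASCII-compatible, so '%' is 0x25.
inline std::size_t count_percent(const std::string &s) {
    std::size_t n = 0;
    for (char c : s)
        if (c == '%')
            ++n;
    return n;
}
```

With these helpers, count_percent("%0 %1") finds both operand markers,
while count_percent(translate("%0 %1")) finds none, because '%' has become
the byte 0x6c. That is the failure mode for inline asm described above.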

-Andi
