> > I think keeping it untranslated is fine for now. Any translation
> > if really needed would be a separate feature.
>
> I mean, unless you make extra effort, it is translated.
> Even in your current version, try constexpr *foo () { return "nop"; }
> and you'll see that it actually results in "\x95\x96\x97" with
> -fexec-charset=EBCDICUS.

Perhaps I don't understand the use case for this option. Would it ever be
used on a non-EBCDIC system?

> What is worse, constexpr *bar () { return "%0 %1"; }
> results in "\x6c\xf0\x40\x6c\xf1", so the compiler will not be able to find
> the % special characters in there etc.
> The parsing of the string literal in asm definitions uses translate=false
> to avoid the translations.
> As the static_assert paper says, for static_assert it isn't that big a deal,
> the program is already UB if it diagnoses static assertion failure, worst
> case it prints garbage if one plays with -fexec-charset=. But for inline
> asm it would fail to compile...
>
> So, the extension really should be well defined vs. the character set,
> either it should be characters in the execution charset and the FE would
> need to ask libcpp to translate it back, or it would need to be declared
> to be e.g. in UTF-8 regardless of the charset (like u8'x' or u8"abc"
> literals are; but then shouldn't the _M_data in that case return a pointer
> to char8_t instead), something else?

Okay then we can always translate to UTF-8.

-Andi
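
For illustration only, here is a minimal sketch of the problem described in the quoted text, assuming an ASCII host. The helpers to_ebcdic, translate and count_percent are hypothetical names made up for this example and are not part of GCC or libcpp. The table covers only the characters that appear in this thread, with byte values taken from the strings quoted above.

```cpp
#include <string>

// Hypothetical helper: maps only the characters used in this thread from
// ASCII to the EBCDIC bytes quoted above ("nop" and "%0 %1").
unsigned char to_ebcdic(char c) {
    switch (c) {
        case 'n': return 0x95;
        case 'o': return 0x96;
        case 'p': return 0x97;
        case '%': return 0x6c;
        case '0': return 0xf0;
        case '1': return 0xf1;
        case ' ': return 0x40;
        default:  return 0x6f; // placeholder byte for characters not in this table
    }
}

// Stand-in for what an execution-charset translation does to a literal.
std::string translate(const std::string& src) {
    std::string out;
    for (char c : src) {
        out.push_back(static_cast<char>(to_ebcdic(c)));
    }
    return out;
}

// Counts operand markers the way a parser looking for a source-charset '%'
// would see them.
int count_percent(const std::string& s) {
    int n = 0;
    for (char c : s) {
        if (c == '%') {
            ++n;
        }
    }
    return n;
}
```

With this sketch, count_percent("%0 %1") finds two markers, while count_percent(translate("%0 %1")) finds none, which is the reason an asm template would have to be translated back (or kept in a fixed encoding such as UTF-8) before it is parsed.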