https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112652

--- Comment #6 from ro at CeBiTec dot Uni-Bielefeld.DE <ro at CeBiTec dot 
Uni-Bielefeld.DE> ---
> --- Comment #5 from ro at CeBiTec dot Uni-Bielefeld.DE <ro at CeBiTec dot
> Uni-Bielefeld.DE> ---
>> --- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
>> Given that C++ says e.g. in https://eel.is/c++draft/lex.ccon#3.1
>> that program is ill-formed if some character lacks encoding in the execution
>> character set, I'm afraid the Solaris iconv behavior results in violation of

Although I can barely wrap my head around the standardese there, I had a
look at n4928 (the last? C++23 draft), which has different wording
here (p.25, 5.13.3):

(3.1) — A character-literal with a c-char-sequence consisting of a
         single basic-c-char, simple-escape-sequence, or
         universal-character-name is the code unit value of the
         specified character as encoded in the literal’s associated
         character encoding.

         [Note 2 : If the specified character lacks representation in
         the literal’s associated character encoding or if it cannot be
         encoded as a single code unit, then the literal is a
         non-encodable character literal. —end note]

> I've not yet tried to understand what either iconv(3) has to say on the
> matter.

Digging further, Solaris iconv(3C) has

       If  iconv()  encounters  a character in the input buffer that is legal,
       but for which an identical character does not exist in the target  code
       set,  iconv()  performs  an  implementation-defined  conversion on this
       character.

which exactly matches XPG7, so the behaviour seems to be in line with
the standards.
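
For reference, here is a minimal sketch of how the two permitted outcomes
can be told apart from the caller's side.  It assumes only the POSIX
iconv(3) interface; the probe_conversion function and the probe_result
enum are hypothetical names made up for this illustration and are not
taken from the GCC sources or the testcase.  Per POSIX, a successful
iconv() call returns the number of non-identical (non-reversible)
conversions performed, so an implementation-defined substitution should
show up as a positive return value, while other implementations reject
the character with EILSEQ.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch (not part of any iconv API).  */
enum probe_result {
  PROBE_ERROR = -1,       /* iconv_open failed, input too long, or other error */
  PROBE_IDENTICAL = 0,    /* converted, no non-identical conversions reported */
  PROBE_SUBSTITUTED = 1,  /* converted, but iconv reported non-identical ones */
  PROBE_REJECTED = 2      /* iconv failed with EILSEQ */
};

/* Convert TEXT from code set FROM to code set TO and classify what
   iconv did.  Note that EILSEQ is also reported for invalid input, so
   TEXT is assumed to be valid in FROM.  */
enum probe_result
probe_conversion (const char *from, const char *to, const char *text)
{
  char inbuf[64];
  char outbuf[256];
  size_t inleft = strlen (text);

  if (inleft >= sizeof inbuf)
    return PROBE_ERROR;

  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return PROBE_ERROR;

  /* iconv takes a non-const input pointer, so work on a copy.  */
  memcpy (inbuf, text, inleft);

  char *in = inbuf;
  char *out = outbuf;
  size_t outleft = sizeof outbuf;

  errno = 0;
  size_t n = iconv (cd, &in, &inleft, &out, &outleft);
  int saved_errno = errno;
  iconv_close (cd);

  if (n == (size_t) -1)
    return saved_errno == EILSEQ ? PROBE_REJECTED : PROBE_ERROR;

  /* A positive count means implementation-defined conversions happened.  */
  return n > 0 ? PROBE_SUBSTITUTED : PROBE_IDENTICAL;
}
```

Calling probe_conversion ("UTF-8", "ISO-8859-1", "\xE2\x82\xAC") probes
U+20AC, which has no representation in ISO-8859-1.  The code set names
are an assumption and may need adjusting for a given system.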

I've also found that Solaris 11 has iconvctl(3C) (obviously patterned
after GNU libiconv) with

       ICONV_SET_TRANSLITERATE

           With  this  request  and  a  pointer to a const int with a non-zero
           value, caller can instruct the current conversion to  transliterate
           non-identical characters from the input buffer during the code con-
           version  as  much  as it can. The value of zero, on the other hand,
           turns it off.

However,

        int transliterate = 0;
        iconvctl (cd, ICONV_SET_TRANSLITERATE, &transliterate);

doesn't make a difference.

The current Solaris iconv behaviour certainly isn't particularly
intuitive, and I'll ask the Solaris engineers about it.  However, that
leaves the question of what to do about the testcase: just xfail it on
Solaris, or omit only the two affected subtests there?
