Harald van Dijk wrote in
 <9aa0b43f-c5de-1698-9f34-c725a40e6...@gigawatt.nl>:
|On 12/05/2022 23:10, Steffen Nurpmeso wrote:
|> Harald van Dijk wrote in
|>  <bd336669-960b-1f5f-fffc-30905d4c8...@gigawatt.nl>:
|>|On 12/05/2022 18:19, Steffen Nurpmeso via austin-group-l at The Open
|>|Group wrote:
|>|> Bruno Haible wrote in
|>|>  <4298913.vrqWZg68TM@omega>:
|>|>|Steffen Nurpmeso wrote:
|>|>|> ...
|>|>|>| [.] "UTF-7"."
|>|>|>
|>|>|> That is overshoot.
|>|>|
|>|>|No. UTF-7 is invalid here because it produces output that is not NUL
|>|>|terminated. See:
|>|>|
|>|>|$ printf 'ab\0' | iconv -t UTF-7 | od -t c
|>|>|0000000 a b + A A A -
|>|>|0000007
|>|>|
|>|>|strlen() on such a return value makes invalid memory accesses.
|>|>|You can convince yourself by running
|>|>|$ OUTPUT_CHARSET=UTF-7 valgrind ls --help
|>|>
|>|> This is then surely bogus? UTF-7 is a normal single byte
|>|> character set and is to be terminated like anything else. Nothing
|>|> in RFC 2152 nor RFC 3501 if you want makes me think something
|>|> else.
|>|
|>|RFC 2152's rules 1 and 3 only allow specifying the listed characters as
|>|their ASCII form. All other characters, including U+0000, must be
|>|encoded using rule 2. GNU iconv is doing what the RFC specifies here.
|>
|> No really, please. And please do not strip important content,
|
|I didn't think I did. You didn't read the RFC properly, I replied to

You again strip content of follow-up RFCs.
I have implemented UTF-7, and i definitely terminate C-style strings.

 ...
|> LC_ALL=C printf 'ab\0' | iconv -f iso-8859-1 -t utf-16 | od -t c
|> 0000000 \0 \0 a \0 b \0 \0 \0
|>
|> Two leading NULs?
|
|This is not what GNU iconv prints at all, at least not on my system,
|which just uses the GNU version unmodified. Rather, it prints

Interesting. Unmodified here too.
Bruno Haible contacted me in private, i gave him all i have.

 ...
|you may want to report this, including steps on how to get a GNU iconv

I have given up on reporting bugs on the sourceware bug tracker.
The reason is on this list, i think.
I skip the rest.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
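
For reference, the "+AAA-" in the quoted od output can be reproduced with
a minimal Python sketch of the RFC 2152 rule 2 step (modified base64 over
UTF-16) applied to U+0000. The helper name utf7_shift_encode is made up
for illustration and is not taken from GNU iconv or any other
implementation mentioned in this thread; it handles a single BMP
character only. It shows nothing more than what such an encoding of NUL
looks like, and that the resulting byte sequence contains no 0 byte of
its own.

```python
import base64


def utf7_shift_encode(ch: str) -> bytes:
    """Encode one BMP character as a UTF-7 shifted sequence (RFC 2152 rule 2).

    Hypothetical helper for illustration only; a real encoder would batch
    consecutive characters into a single shifted run.
    """
    utf16 = ch.encode("utf-16-be")               # rule 2 operates on UTF-16 code units
    b64 = base64.b64encode(utf16).rstrip(b"=")   # modified base64 drops '=' padding
    return b"+" + b64 + b"-"                     # '+' opens the run, '-' closes it


nul = utf7_shift_encode("\0")
print(nul)                                # b'+AAA-'

encoded = b"ab" + nul                     # 'a' and 'b' are direct characters (rule 1)
print(0 in encoded)                       # False: no 0 byte in the encoded form
print(encoded.decode("utf-7") == "ab\0")  # True: decodes back to 'ab' plus U+0000
```

Whether a caller then appends a terminating 0 byte to that buffer is a
separate question from how U+0000 inside the text gets encoded, which is
the point the quoted messages disagree on.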