The point is, such a conversion is utterly meaningless. The code that is trying to do this is *wrong*.
I don't have sufficient context to understand why it's doing this, but I
*suspect* that the more correct code is to just convert this to a
multibyte string, rather than to UTF-8 specifically. (In UTF-8 locales,
the multibyte output will actually be UTF-8. If you're in another
locale, it will of course be different.)

If I'm right, the function should be replaced with a call to wcstombs()
(or one of its variants, such as wcsrtombs() or wcstombs_l()).

In fact, if you *know* the incoming data is UCS-4, you *could* use
newlocale() to get a UTF-8 locale object (en_US.UTF-8 is your best bet;
it's probably installed on pretty much every system), and then use
wcsrtombs_l() to do the conversion in that locale.

On Fri, Aug 15, 2014 at 9:54 AM, Alexander Pyhalov <[email protected]> wrote:

> On 08/15/2014 20:44, Garrett D'Amore via illumos-discuss wrote:
>
>> iconv looks busted to me.
>>
>> Can you convert using iconv() UTF-8 to UCS-4? If you can't, then it's
>> entirely an iconv() problem.
>>
>> If you're working with wchar_t (which is *not* guaranteed to be UCS-4,
>> or anything else), you *must* use the libc wide character functions.
>> The only reason to use iconv to work with UCS-4 strings is because
>> that's an external format; do not use iconv when converting back and
>> forth to wchar_t. (It *should* work if your locale is UTF-8, but it's
>> technically incorrect; the details of how wchar_t's are encoded is an
>> operating system implementation detail.)
>>
>> Out of curiosity, why are you working with UCS-4/UTF-32? I rarely see
>> this content.
>>
>
> Perhaps I just don't fully understand how iconv() works and am making
> some stupid mistakes. Will have to spend more time on it...
> As for my interest in UCS-4, I just want to rewrite the following
> glibmm code (and similar) into something that works on Solaris/illumos:
>
>   gsize n_bytes = 0;
>   const ScopedPtr<char> buf(
>       g_convert(reinterpret_cast<const char*>(str.data()),
>                 str.size() * sizeof(std::wstring::value_type),
>                 "UTF-8", "WCHAR_T", 0, &n_bytes, &error));
>
> It ends up calling iconv_open("UTF-8", "WCHAR_T") and failing there
> (as UTF-8 <=> WCHAR_T conversion is unsupported on Solaris/illumos).
> GNU libiconv handles such a conversion.
>
> --
> Best regards,
> Alexander Pyhalov,
> system administrator of Computer Center of Southern Federal University
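
For illustration, the newlocale() suggestion at the top of this message
could be sketched roughly as follows. This is only a sketch under stated
assumptions, not a drop-in replacement for the glibmm code: the helper
name wide_to_utf8 is hypothetical, the "C.UTF-8" fallback locale name is
an assumption about what may be installed, and it uses POSIX uselocale()
together with wcsrtombs() instead of wcsrtombs_l(), since the _l variant
is not provided by every libc.

```cpp
#include <locale.h>   // newlocale, uselocale, freelocale (POSIX.1-2008)
#include <cwchar>     // std::wcsrtombs, std::mbstate_t
#include <string>
#include <vector>

// Hypothetical helper: encode a wide string as UTF-8 using only libc.
// Returns false if no UTF-8 locale could be created or if a character
// cannot be converted. Conversion stops at the first embedded L'\0'.
bool wide_to_utf8(const std::wstring& in, std::string& out)
{
    // Assumption: at least one of these locale names is installed.
    const char* candidates[] = { "en_US.UTF-8", "C.UTF-8" };
    locale_t utf8 = (locale_t)0;
    for (const char* name : candidates) {
        utf8 = newlocale(LC_CTYPE_MASK, name, (locale_t)0);
        if (utf8 != (locale_t)0)
            break;
    }
    if (utf8 == (locale_t)0)
        return false;

    // Switch only the calling thread to the UTF-8 locale.
    locale_t previous = uselocale(utf8);
    bool ok = false;

    // First pass: a null destination asks for the required byte count.
    const wchar_t* src = in.c_str();
    std::mbstate_t state = std::mbstate_t();
    std::size_t need = std::wcsrtombs(nullptr, &src, 0, &state);

    if (need != static_cast<std::size_t>(-1)) {
        // Second pass: do the conversion (one extra byte for the NUL).
        std::vector<char> buf(need + 1);
        src = in.c_str();
        state = std::mbstate_t();
        std::size_t n = std::wcsrtombs(buf.data(), &src, buf.size(), &state);
        if (n != static_cast<std::size_t>(-1)) {
            out.assign(buf.data(), n);
            ok = true;
        }
    }

    // Restore the thread's previous locale and release the object.
    uselocale(previous);
    freelocale(utf8);
    return ok;
}
```

A caller would pass the std::wstring directly, for example
wide_to_utf8(str, result), and check the return value. If multibyte
output in the user's own locale is what is actually wanted, the locale
switching can be dropped and wcsrtombs() called on its own.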
