The point is, such a conversion is utterly meaningless.  The code that is
trying to do this is *wrong*.

I don't have sufficient context to understand why it's doing this, but I
*suspect* that the more correct code is to just convert this to a multibyte
string, rather than to UTF-8 specifically.  (In UTF-8 locales, the
multibyte output will actually be UTF-8.  If you're in another locale,
it's of course different.)
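
A minimal sketch of what "just convert to a multibyte string" could look like. The helper name wcs_to_mbs() is made up for illustration, and it assumes the program has already selected its locale (for example with setlocale(LC_ALL, "")):

```c
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert a wide string to the multibyte encoding
 * of the *current* locale.  The caller is assumed to have set the locale
 * already, e.g. with setlocale(LC_ALL, "").
 * Returns a malloc()ed string, or NULL on failure.
 */
char *
wcs_to_mbs(const wchar_t *ws)
{
	/* A NULL destination asks only for the required byte count. */
	size_t len = wcstombs(NULL, ws, 0);
	if (len == (size_t)-1)
		return (NULL);	/* a character has no representation */

	char *out = malloc(len + 1);
	if (out == NULL)
		return (NULL);

	/* len + 1 leaves room for the terminating NUL. */
	(void) wcstombs(out, ws, len + 1);
	return (out);
}
```

The output encoding is whatever the locale says it is, so this only yields UTF-8 when the process runs in a UTF-8 locale.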

If I'm right, the function should be replaced with a call to wcstombs() (or
one of the variants, such as wcsrtombs() or wcstombs_l()).  In fact, if you
*know* the incoming data is UCS-4, you *could* use newlocale() to get a
UTF-8 locale object (en_US.UTF-8 is your best bet; it's probably installed
on pretty much every system), and then use wcsrtombs_l() to do a conversion
in that locale.
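
A rough sketch of the locale-object approach. Since the _l variants are not available everywhere, this version swaps in uselocale() plus plain wcsrtombs(), which should behave the same for the calling thread; where wcsrtombs_l() exists it could be called directly instead. The helper name wcs_to_utf8() and the fallback locale name "C.UTF-8" are assumptions for illustration:

```c
#define _XOPEN_SOURCE 700
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert a wide string to UTF-8 by temporarily
 * switching the calling thread to a UTF-8 locale.  Assumes the wchar_t
 * values are what that locale expects (UCS-4 in practice, but that is
 * an implementation detail, not a guarantee).
 * Returns a malloc()ed string, or NULL on failure.
 */
char *
wcs_to_utf8(const wchar_t *ws)
{
	/* Locale names are assumptions; try a couple of common ones. */
	static const char *names[] = { "en_US.UTF-8", "C.UTF-8" };
	locale_t loc = (locale_t)0;

	for (size_t i = 0; i < 2 && loc == (locale_t)0; i++)
		loc = newlocale(LC_CTYPE_MASK, names[i], (locale_t)0);
	if (loc == (locale_t)0)
		return (NULL);	/* no UTF-8 locale installed */

	locale_t old = uselocale(loc);	/* affects this thread only */
	char *out = NULL;
	const wchar_t *src = ws;
	mbstate_t st;

	/* First pass: NULL destination measures the output length. */
	memset(&st, 0, sizeof (st));
	size_t len = wcsrtombs(NULL, &src, 0, &st);
	if (len != (size_t)-1 && (out = malloc(len + 1)) != NULL) {
		/* Second pass: do the conversion, including the NUL. */
		src = ws;
		memset(&st, 0, sizeof (st));
		(void) wcsrtombs(out, &src, len + 1, &st);
	}

	(void) uselocale(old);	/* restore the previous locale */
	freelocale(loc);
	return (out);
}
```

The caller frees the result; a NULL return means either that no UTF-8 locale could be created or that the input contained a value the locale could not encode.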


On Fri, Aug 15, 2014 at 9:54 AM, Alexander Pyhalov <[email protected]> wrote:

> On 08/15/2014 20:44, Garrett D'Amore via illumos-discuss wrote:
>
>> iconv looks busted to me.
>>
>> Can you convert using iconv() UTF-8 to UCS-4?  If you can't, then it's
>> entirely an iconv() problem.
>>
>> If you're working with wchar_t (which is *not* guaranteed to be UCS-4, or
>> anything else), you *must* use the libc wide character functions.  The only
>> reason to use iconv to work with UCS-4 strings is because that's an
>> external format; do not use iconv when converting back and forth to
>> wchar_t.  (It *should* work if your locale is UTF-8, but it's technically
>> incorrect; how wchar_t values are encoded is an operating system
>> implementation detail.)
>>
>> Out of curiosity, why are you working with UCS-4/UTF-32?  I rarely see
>> this content.
>>
>
> Perhaps I just don't fully understand how iconv() works and am making some
> stupid errors. I will have to spend more time on it... As for the interest
> in UCS-4, I just want to rewrite the following glibmm code (and similar)
> into something that works on Solaris/illumos:
>
>   gsize n_bytes = 0;
>   const ScopedPtr<char> buf (
>               g_convert(reinterpret_cast<const char*>(str.data()),
>                       str.size() * sizeof(std::wstring::value_type),
>                       "UTF-8", "WCHAR_T", 0, &n_bytes, &error));
>
> It ends up calling iconv_open("UTF-8", "WCHAR_T") and failing there
> (as UTF-8 <=> WCHAR_T conversion is unsupported on Solaris/illumos). GNU
> libiconv handles this conversion.
>
>
> --
> Best regards,
> Alexander Pyhalov,
> system administrator of Computer Center of Southern Federal University
>


