iconv looks busted to me. Can you convert UTF-8 to UCS-4 using iconv()? If you can't, then it's entirely an iconv() problem.
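A minimal sketch of such a check, assuming the implementation accepts the encoding names passed to iconv_open() (the accepted names vary between systems). utf8_to() is a hypothetical helper written for illustration, not part of libc or libiconv:

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a NUL-terminated UTF-8 string to the encoding named by `to`
 * (for example "UCS-4").  Returns the number of bytes written to `out`,
 * or (size_t)-1 on failure with errno left as set by iconv_open()/iconv().
 * utf8_to() is a hypothetical helper, not a library function.
 */
size_t
utf8_to(const char *to, const char *utf8, char *out, size_t outsz)
{
    iconv_t cd = iconv_open(to, "UTF-8");
    if (cd == (iconv_t)-1)
        return ((size_t)-1);

    /*
     * iconv() takes char ** and size_t * and advances them as it goes,
     * so it is handed separate cursor variables.  The cast drops const
     * only because the prototype demands char **; the input is not
     * modified.
     */
    char *inp = (char *)utf8;
    size_t inleft = strlen(utf8);
    char *outp = out;
    size_t outleft = outsz;

    size_t rv = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rv != (size_t)-1) {
        /* Flush any pending shift state (a no-op for stateless encodings). */
        rv = iconv(cd, NULL, NULL, &outp, &outleft);
    }
    int saved = errno;
    iconv_close(cd);

    if (rv == (size_t)-1) {
        errno = saved;
        return ((size_t)-1);
    }
    return (outsz - outleft);
}
```

Calling utf8_to("UCS-4", text, buf, sizeof (buf)) and checking for (size_t)-1 with errno (EILSEQ for invalid input, E2BIG for a short buffer) should show whether the conversion itself fails or whether it crashes outright.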
If you're working with wchar_t (which is *not* guaranteed to be UCS-4, or
anything else), you *must* use the libc wide character functions. The only
reason to use iconv with UCS-4 strings is that UCS-4 is an external format;
do not use iconv to convert to or from wchar_t. (It *should* work if your
locale is UTF-8, but it's technically incorrect; how wchar_t values are
encoded is an operating system implementation detail.)

Out of curiosity, why are you working with UCS-4/UTF-32? I rarely see
content in these encodings.

On Fri, Aug 15, 2014 at 9:18 AM, Alexander Pyhalov <[email protected]> wrote:

> On 08/15/2014 19:20, Garrett D'Amore wrote:
>
>> To get from wchar_t to multibyte string, you do wcstombs(). Note that the
>> resulting output will only be UTF-8 if the locale is *.UTF-8. (If you're
>> in a different locale, the multi-byte-string may well be in a different
>> encoding.)
>>
>> Visually, your code above looks OK, but I'm not sure what's wrong. Is it
>> a bug in libiconv? My guess is so, since it seems that even if the
>> encoding was invalid, it shouldn't just dump core. Instead it should
>> return an error such as EILSEQ.
>>
>> Admittedly, I have less than perfect confidence in our libiconv
>> implementation.
>>
>
> OK, wcstombs works. As application locale can be not UTF-8, I'd like to
> use iconv later to convert result from application locale to UTF-8.
> And receive one more core dump...
>
> int main()
> {
>     char out[1024],res[1024];
>     int ret;
>     wchar_t *in;
>     size_t inlen,outlen;
>     char *locale;
>     char *second_part;
>     size_t outsz=sizeof(out);
>     char *enc;
>     iconv_t hdl;
>
>     in=L"Привет!";
>
>     locale=setlocale(LC_ALL,"");
>     second_part=strchr(locale,'.');
>     if(second_part){
>         enc=strdup(second_part+1);
>     } else {
>         enc=strdup(locale);
>     }
>
>     if(enc){
>         printf("enc is %s\n",enc);
>         hdl = iconv_open("UTF-8", enc);
>         if(hdl<0) {
>             perror("iconv_open");
>             return -1;
>         }
>         ret=wcsrtombs((char*)out,&in,sizeof(out),NULL);
>         printf("ret is %d\n",ret);
>         out[ret+1]='\0';
>         printf("result is %s\n",out);
>         ret=strlen(out);
>         iconv(hdl,&out,&ret,&res,&outlen);
>         printf("%s\n",res);
>         free(enc);
>     }
>     return 0;
> }
>
> $ ./test_utf8_mbchar
> enc is UTF-8
> ret is 13
> result is Привет!
> Segmentation Fault (core dumped)
> ...
>
> Core was generated by `./test_utf8_mbchar'.
> Program terminated with signal 11, Segmentation fault.
> #0  0xfedd06d5 in _icv_iconv () from /usr/lib/iconv/UTF-8%UTF-8.so
> (gdb) bt
> #0  0xfedd06d5 in _icv_iconv () from /usr/lib/iconv/UTF-8%UTF-8.so
> #1  0xfee7dc17 in iconv () from /lib/libc.so.1
> #2  0x080510c0 in main ()
>
> --
> Best regards,
> Alexander Pyhalov,
> system administrator of Computer Center of Southern Federal University
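
For comparison, a minimal sketch of the same wchar_t to locale multibyte to UTF-8 path as the quoted program. iconv() is declared to take char ** and size_t * arguments, so the sketch hands it separate cursor and length variables, and it compares the iconv_open() result against (iconv_t)-1. Whether those differences account for the crash above is not established here. The helper names mbs_to_utf8() and wcs_to_utf8() are hypothetical, and using nl_langinfo(CODESET) to name the locale's encoding is an assumption of this sketch (the quoted program parses the setlocale() result instead); the name it returns may not be accepted by every iconv_open().

```c
#include <errno.h>
#include <iconv.h>
#include <langinfo.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Convert `in`, a NUL-terminated multibyte string in encoding `from`,
 * to a NUL-terminated UTF-8 string in `out`.  Returns the number of
 * bytes written (excluding the NUL), or (size_t)-1 with errno set.
 * Hypothetical helper, for illustration only.
 */
size_t
mbs_to_utf8(const char *from, const char *in, char *out, size_t outsz)
{
    if (outsz == 0) {
        errno = E2BIG;
        return ((size_t)-1);
    }

    iconv_t cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1)          /* the failure value is (iconv_t)-1 */
        return ((size_t)-1);

    /* Cursors that iconv() advances; lengths are size_t, not int. */
    char *inp = (char *)in;         /* cast only to match the prototype */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsz - 1;     /* keep one byte for the terminator */

    size_t rv = iconv(cd, &inp, &inleft, &outp, &outleft);
    int saved = errno;
    iconv_close(cd);

    if (rv == (size_t)-1) {
        errno = saved;
        return ((size_t)-1);
    }
    *outp = '\0';
    return ((size_t)(outp - out));
}

/*
 * Wide string -> multibyte string in the current locale (wcstombs) ->
 * UTF-8 (iconv).  The locale's encoding name comes from
 * nl_langinfo(CODESET), which is an assumption of this sketch.
 */
size_t
wcs_to_utf8(const wchar_t *ws, char *out, size_t outsz)
{
    char mb[1024];
    size_t n = wcstombs(mb, ws, sizeof (mb) - 1);
    if (n == (size_t)-1)            /* not representable in this locale */
        return ((size_t)-1);
    mb[n] = '\0';                   /* wcstombs may not terminate if full */

    return (mbs_to_utf8(nl_langinfo(CODESET), mb, out, outsz));
}
```

A caller would run setlocale(LC_ALL, "") first, then wcs_to_utf8(L"...", buf, sizeof (buf)), and report errno if the result is (size_t)-1.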
