On 02/21/2012 01:34 PM, Eric Blake wrote:
> On 02/20/2012 07:42 PM, Chet Ramey wrote:
>> On 2/18/12 5:39 AM, John Kearney wrote:
>>
>>> Bash Version: 4.2
>>> Patch Level: 10
>>> Release Status: release
>>>
>>> Description: Current u32toutf8 only encodes values below 0xffff
>>> correctly. wchar_t can be of ambiguous size; better, in my opinion,
>>> to use unsigned long, or uint32_t, or something clearer.
>>
>> Thanks for the patch. It's good to have a complete
>> implementation, though as a practical matter you won't see UTF-8
>> characters longer than four bytes. I agree with you about the
>> unsigned 32-bit int type; wchar_t is signed, even if it's 32
>> bits, on several systems I use.
>
> Not only can wchar_t be either signed or unsigned, you also
> have to worry about platforms where it is only 16 bits, such as
> cygwin; on the other hand, wint_t is always 32 bits, but you still
> have the issue that it can be either signed or unsigned.
>

Signed vs. unsigned isn't really the problem anyway: UTF-8 only encodes
up to 0x7fffffff and UTF-16 only encodes up to 0x0010ffff. Both limits
fit in 31 bits, so in a 32-bit type the sign bit never comes into play.

In my latest version I've pretty much removed all references to wchar_t
in unicode.c; it was unnecessary. However, I would be interested in
something like utf16_t or uint16_t. I'm currently using unsigned short,
which is inelegant but works.
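
To make the ranges above concrete, here is a minimal sketch of encoding with fixed-width types instead of wchar_t. It assumes C99 <stdint.h> is available. The names sketch_u32toutf8 and sketch_u32toutf16 are hypothetical; this is not the code from the patch or from bash's unicode.c. Rejecting surrogate values in the UTF-16 helper is also an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the function in bash's unicode.c.
   Encodes c into buf (room for at least 6 bytes) using the original
   UTF-8 scheme, which reaches 0x7fffffff in up to six bytes.
   Returns the number of bytes written, or 0 if c is out of range. */
static size_t
sketch_u32toutf8 (uint32_t c, unsigned char *buf)
{
  size_t len, i;

  if (c < 0x80)
    {
      buf[0] = (unsigned char) c;
      return 1;
    }
  else if (c < 0x800)
    len = 2;
  else if (c < 0x10000)
    len = 3;
  else if (c < 0x200000)
    len = 4;
  else if (c < 0x4000000)
    len = 5;
  else if (c <= 0x7fffffff)
    len = 6;
  else
    return 0;

  /* Fill continuation bytes from the end, six payload bits each. */
  for (i = len - 1; i > 0; i--)
    {
      buf[i] = (unsigned char) (0x80 | (c & 0x3f));
      c >>= 6;
    }
  /* Lead byte: len high bits set, then the remaining payload bits. */
  buf[0] = (unsigned char) ((0xff << (8 - len)) | c);
  return len;
}

/* Hypothetical helper. Encodes c into buf (room for 2 units) as UTF-16.
   Returns the number of 16-bit units written, or 0 if c is a surrogate
   value or lies above 0x10ffff. */
static size_t
sketch_u32toutf16 (uint32_t c, uint16_t *buf)
{
  if (c >= 0xd800 && c <= 0xdfff)
    return 0;
  if (c < 0x10000)
    {
      buf[0] = (uint16_t) c;
      return 1;
    }
  if (c > 0x10ffff)
    return 0;

  /* Split the 20-bit offset into a high and a low surrogate. */
  c -= 0x10000;
  buf[0] = (uint16_t) (0xd800 | (c >> 10));
  buf[1] = (uint16_t) (0xdc00 | (c & 0x3ff));
  return 2;
}
```

With these, sketch_u32toutf8 (0x20ac, buf) writes the three bytes e2 82 ac, and sketch_u32toutf16 (0x1f600, buf) writes the pair d83d de00. On platforms without <stdint.h>, a configure-time typedef would have to stand in for uint32_t and uint16_t.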