[CCing bug-libunistring]

Gavin Smith wrote:
> I did not understand why uc_width was said to be "locale dependent":
>
>   "These functions are locale dependent."
>
> - from
> <https://www.gnu.org/software/libunistring/manual/html_node/uniwidth_002eh.html#index-uc_005fwidth>.

That's because some Unicode characters have "ambiguous width": width 1 in
Western locales, width 2 in East Asian locales (for historic and font choice
reasons).

> I also don't understand the purpose of the "encoding" argument -- can this
> always be "UTF-8"?

Yes, it can always be "UTF-8"; then uc_width will always choose width 1 for
these ambiguous-width characters.

> I'm also unclear on the exact relationship between the types char32_t,
> ucs4_t and uint32_t. For example, uc_width takes a ucs4_t argument
> but u8_mbtouc writes to a char32_t variable. In the code I committed,
> I used a cast to ucs4_t when calling uc_width.

These types are all identical. Therefore you don't even need to cast.
  - char32_t comes from <uchar.h> (ISO C 11 or newer).
  - ucs4_t comes from GNU libunistring.
  - uint32_t comes from <stdint.h>.

Bruno