On Sat, Nov 11, 2023 at 11:54:52PM +0100, Bruno Haible wrote:
> [CCing bug-libunistring]
> Gavin Smith wrote:
> > I did not understand why uc_width was said to be "locale dependent":
> >
> > "These functions are locale dependent."
> >
> > - from
> > <https://www.gnu.org/software/libunistring/manual/html_node/uniwidth_002eh.html#index-uc_005fwidth>.
>
> That's because some Unicode characters have "ambiguous width": width 1 in
> Western locales, width 2 in East Asian locales (for historic and font choice
> reasons).
>
> > I also don't understand the purpose of the "encoding" argument -- can this
> > always be "UTF-8"?
>
> Yes, it can be always "UTF-8"; then uc_width will always choose width 1 for
> these characters.
>
> > I'm also unclear on the exact relationship between the types char32_t,
> > ucs4_t and uint32_t. For example, uc_width takes a ucs4_t argument
> > but u8_mbtouc writes to a char32_t variable. In the code I committed,
> > I used a cast to ucs4_t when calling uc_width.
>
> These types are all identical. Therefore you don't even need to cast.
>
> - char32_t comes from <uchar.h> (ISO C 11 or newer).
> - ucs4_t comes from GNU libunistring.
> - uint32_t comes from <stdint.h>.

Thanks for the advice.
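
For the archives, here is a minimal sketch of the point about the types. It
compiles without libunistring, so it uses a stand-in typedef (my_ucs4_t) in
place of the real ucs4_t, on the assumption that <unitypes.h> defines ucs4_t
as a typedef of uint32_t. The helper is_ascii is hypothetical and only plays
the role of a function that, like uc_width, takes a ucs4_t argument. The
_Generic check reflects common platforms such as glibc; the C standard only
says char32_t has the same type as uint_least32_t.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */
#include <uchar.h>    /* char32_t */

/* Stand-in for libunistring's ucs4_t (assumed: typedef of uint32_t).
   Named differently so it cannot clash with the real header. */
typedef uint32_t my_ucs4_t;

/* Fails at compile time if the two types differ in size. */
static_assert (sizeof (char32_t) == sizeof (uint32_t),
               "char32_t and uint32_t differ in size");

/* Returns 1 if char32_t and uint32_t are the same type here, else 0. */
int
char32_matches_uint32 (void)
{
  return _Generic ((char32_t) 0, uint32_t: 1, default: 0);
}

/* Hypothetical helper taking the stand-in type as its argument. */
int
is_ascii (my_ucs4_t uc)
{
  return uc < 0x80;
}

/* Passes a char32_t value to the helper with no cast. */
int
check_no_cast (void)
{
  char32_t c = U'A';
  return is_ascii (c);
}
```

If char32_matches_uint32 returns 1 on a given platform, the cast to ucs4_t
before calling uc_width is redundant there, although leaving it in is harmless.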