>>>>> "JC" == James Cloos <cl...@jhcloos.com> writes:
>>>>> "AC" == Alan Coopersmith <alan.coopersm...@oracle.com> writes:

AC> Unfortunately, I couldn't find any existing XCB API for COMPOUND_TEXT
AC> decoding, and reading the spec made my head spin and I had to go lay
AC> down before I puked.

I can understand that!  ISO 2022 is, er, /interesting/.

AC> The libX11 code doesn't seem to be a simple function we can copy, but
AC> tied into the whole Xlib i18n module system, so may not help much.

AC> If we did ship with this regression, would it actually cause critical
AC> problems?  Hopefully nothing is trying to parse xwininfo output to get
AC> names instead of just getting the properties themselves.

I looked at the Emacs src to prepare a patch to add support for the
_NET NAME props.  It will be even easier than I expected.

Given that, the difficulty of supporting COMPOUND_TEXT in xcb, and
presuming that xwininfo(1) is only used by people and not parsed, I
have to agree that we can live with the regression.

The only (possible) issue left is that xwininfo will sometimes fail
to print the WM_NAME at all.

In xterm, which sets WM_NAME(STRING) and WM_LOCALE_NAME(STRING), with
LOCALE set to en_US.UTF-8, I get this from a quick test:

:; for ij in á ȩ 金;do printf "\033]0;${ij}\a" && xprop -id 0x1800010|grep NAME;done
WM_LOCALE_NAME(STRING) = "en_US.UTF-8"
WM_ICON_NAME(STRING) = "á"
WM_NAME(STRING) = "á"
WM_LOCALE_NAME(STRING) = "en_US.UTF-8"
WM_ICON_NAME(STRING) = "È©"
WM_NAME(STRING) = "È©"
WM_LOCALE_NAME(STRING) = "en_US.UTF-8"
WM_ICON_NAME(STRING) = "é??"
WM_NAME(STRING) = "é??"

:; for ij in á ȩ 金;do printf "\033]0;${ij}\a" && ./xwininfo -id 0x1800010|grep ^x;done
xwininfo: Window id: 0x1800010 "á"
xwininfo: Window id: 0x1800010 "ȩ"
xwininfo: Window id: 0x1800010 "

In urxvt, which sets the _NET props, I get:

:; for ij in á ȩ 金;do printf "\033]0;${ij}\a" && ./xwininfo -id 0x600022|grep ^x;done
xwininfo: Window id: 0x600022 "á"
xwininfo: Window id: 0x600022 "ȩ"
xwininfo: Window id: 0x600022 "金"

as expected.

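For checking whether a property value such as the xterm titles above is
well-formed UTF-8 before printing it, a validator along these lines might
do.  This is only a sketch: is_valid_utf8() is a hypothetical name, not an
existing xwininfo, xcb, or Xlib function, and it says nothing about what
xwininfo does today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of xwininfo, xcb, or Xlib).
 * Returns true if the len bytes at s are well-formed UTF-8:
 * no stray continuation bytes, no truncated or overlong sequences,
 * no surrogates, and nothing above U+10FFFF.
 */
static bool
is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);

    while (i < len) {
        unsigned char c = s[i];
        size_t need;        /* continuation bytes expected after c */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point legal at this length */

        if (c < 0x80) {                 /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;               /* stray continuation or bad lead */
        }

        if (len - i - 1 < need)
            return false;               /* sequence cut short */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;           /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return false;               /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;               /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return false;               /* beyond Unicode range */

        i += need + 1;
    }
    return true;
}
```

A caller could then fall back to some other presentation of the raw bytes
(escaped octets, say) whenever this returns false, instead of printing a
truncated name.
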
It may be that xterm sends bogus utf8 in the third case; xwininfo may
just need to detect bad utf8.

And I cannot tell from the code or the commit log how xwininfo guesses
in the first two cases that the STRING is actually UTF-8.  Is it just
because xwininfo’s locale is .UTF-8?

In any case, perhaps it should do something better when the UTF-8 is
not valid?

That it handles the first two cases is a welcome progression, btw.

AC> If someone has a good way to solve this problem or wants to sign up
AC> to write an xcb-util library for COMPOUND_TEXT encoding/decoding, great,
AC> but I don't think I'll be solving it.

JC> On a related note, we should make xprop(1) report UTF8_STRING props
JC> using code similar to what you added here, falling back to the current
JC> output when conversion to the locale cannot work.

AC> Is this sort of thing common enough to make it worthwhile to have a
AC> xcb/util library for property encoding/decoding?  Certainly Xlib
AC> handled this for you and hid it from applications, though it built
AC> the huge xlibi18n infrastructure around it.

I do think a simple set of routines would be useful.

I wanted to suggest building on the <wchar.h> api, adding support for
compound_text.  But the wchar api is itself a royal pain.

I’m left suggesting just xcb_utf8_to_compound_text() and
xcb_compound_text_to_utf8().  The compound text side would be just an
(octet?) array; the utf8 side should be a tuple of an octet array and
a token representing the preferred script (to deal with 2022 vs 10646
(dis-)unification).

But comments on that are extremely welcome!

-JimC
-- 
James Cloos <cl...@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6
_______________________________________________
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel