Re: xterm and wcwidth()

Ingo Schwarze Fri, 08 Mar 2019 05:40:15 -0800

Hi,

Lauri Tirkkonen wrote on Fri, Mar 08, 2019 at 12:43:11PM +0200:


> I feel like xterm should just use the system wcwidth() to avoid these
> mismatches, so rudimentary diff to do that below.

Absolutely, i strongly agree with that sentiment.

Having a local character width table in an application program is
just wrong.  Archaic operating systems like Oracle Solaris 11 that
have broken wcwidth(3) implementations should either fix their C
library or go to hell.

I also agree with the details of the diff.  It is minimal in so far
as it changes only one file, keeping the annoyance minimal when merging
new xterm releases.  The function systemWcwidthOk() is horrific
enough that deleting it outright makes sense, to make sure that it
can never get called by accident.

Note that the command line options -cjk_width and -mk_width and the
related resources remain broken.  But i'd say people using them get
what they deserve.  The disruption to updates that would be caused
by excising them from multiple files is probably better avoided.

Unchanged diff left in place below for easy reference.

OK to commit?
  Ingo


diff --git a/app/xterm/util.c b/app/xterm/util.c
index a3de4a005..9b6443683 100644
--- a/app/xterm/util.c
+++ b/app/xterm/util.c
@@ -4980,61 +4980,6 @@ decode_keyboard_type(XtermWidget xw, XTERM_RESOURCE * rp)
 }
 
 #if OPT_WIDE_CHARS
-#if defined(HAVE_WCHAR_H) && defined(HAVE_WCWIDTH)
-/*
- * If xterm is running in a UTF-8 locale, it is still possible to encounter
- * old runtime configurations which yield incomplete or inaccurate data.
- */
-static Bool
-systemWcwidthOk(int samplesize, int samplepass)
-{
-    wchar_t n;
-    int oops = 0;
-
-    for (n = 21; n <= 25; ++n) {
-       wchar_t code = (wchar_t) dec2ucs(NULL, (unsigned) n);
-       int system_code = wcwidth(code);
-       int intern_code = mk_wcwidth(code);
-
-       /*
-        * Solaris 10 wcwidth() returns "2" for all of the line-drawing (page
-        * 0x2500) and most of the geometric shapes (a few are excluded, just
-        * to make it more difficult to use).  Do a sanity check to avoid using
-        * it.
-        */
-       if ((system_code < 0 && intern_code >= 1)
-           || (system_code >= 0 && intern_code != system_code)) {
-           TRACE(("systemWcwidthOk: broken system line-drawing wcwidth\n"));
-           oops += (samplepass + 1);
-           break;
-       }
-    }
-
-    for (n = 0; n < (wchar_t) samplesize; ++n) {
-       int system_code = wcwidth(n);
-       int intern_code = mk_wcwidth(n);
-
-       /*
-        * When this check was originally implemented, there were few if any
-        * libraries with full Unicode coverage.  Time passes, and it is
-        * possible to make a full comparison of the BMP.  There are some
-        * differences: mk_wcwidth() marks some codes as combining and some
-        * as single-width, differing from GNU libc.
-        */
-       if ((system_code < 0 && intern_code >= 1)
-           || (system_code >= 0 && intern_code != system_code)) {
-           TRACE((".. width(U+%04X) = %d, expected %d\n",
-                  (unsigned) n, system_code, intern_code));
-           if (++oops > samplepass)
-               break;
-       }
-    }
-    TRACE(("systemWcwidthOk: %d/%d mismatches, allowed %d\n",
-          oops, (int) n, samplepass));
-    return (oops <= samplepass);
-}
-#endif /* HAVE_WCWIDTH */
-
 void
 decode_wcwidth(XtermWidget xw)
 {
@@ -5045,8 +4990,7 @@ decode_wcwidth(XtermWidget xw)
     switch (mode) {
     default:
 #if defined(HAVE_WCHAR_H) && defined(HAVE_WCWIDTH)
-       if (xtermEnvUTF8() &&
-           systemWcwidthOk(xw->misc.mk_samplesize, xw->misc.mk_samplepass)) {
+       if (xtermEnvUTF8()) {
            my_wcwidth = wcwidth;
            TRACE(("using system wcwidth() function\n"));
            break;

Re: xterm and wcwidth()

Reply via email to