Re: xterm and wcwidth()

Matthieu Herrb Fri, 08 Mar 2019 11:26:55 -0800

On Fri, Mar 08, 2019 at 02:39:32PM +0100, Ingo Schwarze wrote:
> Hi,
> 
> Lauri Tirkkonen wrote on Fri, Mar 08, 2019 at 12:43:11PM +0200:
> 
> > I feel like xterm should just use the system wcwidth() to avoid these
> > mismatches, so rudimentary diff to do that below.
> 
> Absolutely, i strongly agree with that sentiment.
> 
> Having a local character width table in an application program is
> just wrong.  Archaic operating systems like Oracle Solaris 11 that
> have broken wcwidth(3) implementations should either fix their C
> library or go to hell.
> 
> I also agree with the details of the diff.  It is minimal in so far
> as it changes only one file, keeping the annoyance minimal when merging
> new xterm releases.  The function systemWcwidthOk() is horrific
> enough that deleting it outright makes sense, to make sure that it
> can never get called by accident.
> 
> Note that the command line options -cjk_width and -mk_width and the
> related resources remain broken.  But i'd say people using them get
> what they deserve.  The disruption to updates that would be caused
> by excising them from multiple files is probably better avoided.
> 
> Unchanged diff left in place below for easy reference.
> 
> OK to commit?


hi,

I would prefer a diff that just add a &&!defined(__OpenBSD__) to the
condition before the definition of systemWcwidthOk(). This will cause
less risk of conflicts in future updates and clearly show the
intention.

but either way ok matthieu@

>   Ingo
> 
> 
> diff --git a/app/xterm/util.c b/app/xterm/util.c
> index a3de4a005..9b6443683 100644
> --- a/app/xterm/util.c
> +++ b/app/xterm/util.c
> @@ -4980,61 +4980,6 @@ decode_keyboard_type(XtermWidget xw, XTERM_RESOURCE * 
> rp)
>  }
>  
>  #if OPT_WIDE_CHARS
> -#if defined(HAVE_WCHAR_H) && defined(HAVE_WCWIDTH)
> -/*
> - * If xterm is running in a UTF-8 locale, it is still possible to encounter
> - * old runtime configurations which yield incomplete or inaccurate data.
> - */
> -static Bool
> -systemWcwidthOk(int samplesize, int samplepass)
> -{
> -    wchar_t n;
> -    int oops = 0;
> -
> -    for (n = 21; n <= 25; ++n) {
> -     wchar_t code = (wchar_t) dec2ucs(NULL, (unsigned) n);
> -     int system_code = wcwidth(code);
> -     int intern_code = mk_wcwidth(code);
> -
> -     /*
> -      * Solaris 10 wcwidth() returns "2" for all of the line-drawing (page
> -      * 0x2500) and most of the geometric shapes (a few are excluded, just
> -      * to make it more difficult to use).  Do a sanity check to avoid using
> -      * it.
> -      */
> -     if ((system_code < 0 && intern_code >= 1)
> -         || (system_code >= 0 && intern_code != system_code)) {
> -         TRACE(("systemWcwidthOk: broken system line-drawing wcwidth\n"));
> -         oops += (samplepass + 1);
> -         break;
> -     }
> -    }
> -
> -    for (n = 0; n < (wchar_t) samplesize; ++n) {
> -     int system_code = wcwidth(n);
> -     int intern_code = mk_wcwidth(n);
> -
> -     /*
> -      * When this check was originally implemented, there were few if any
> -      * libraries with full Unicode coverage.  Time passes, and it is
> -      * possible to make a full comparison of the BMP.  There are some
> -      * differences: mk_wcwidth() marks some codes as combining and some
> -      * as single-width, differing from GNU libc.
> -      */
> -     if ((system_code < 0 && intern_code >= 1)
> -         || (system_code >= 0 && intern_code != system_code)) {
> -         TRACE((".. width(U+%04X) = %d, expected %d\n",
> -                (unsigned) n, system_code, intern_code));
> -         if (++oops > samplepass)
> -             break;
> -     }
> -    }
> -    TRACE(("systemWcwidthOk: %d/%d mismatches, allowed %d\n",
> -        oops, (int) n, samplepass));
> -    return (oops <= samplepass);
> -}
> -#endif /* HAVE_WCWIDTH */
> -
>  void
>  decode_wcwidth(XtermWidget xw)
>  {
> @@ -5045,8 +4990,7 @@ decode_wcwidth(XtermWidget xw)
>      switch (mode) {
>      default:
>  #if defined(HAVE_WCHAR_H) && defined(HAVE_WCWIDTH)
> -     if (xtermEnvUTF8() &&
> -         systemWcwidthOk(xw->misc.mk_samplesize, xw->misc.mk_samplepass)) {
> +     if (xtermEnvUTF8()) {
>           my_wcwidth = wcwidth;
>           TRACE(("using system wcwidth() function\n"));
>           break;

-- 
Matthieu Herrb

Re: xterm and wcwidth()

Reply via email to