That is:

- libedit has a wchar_t* buffer (el->el_line.buffer) and el_line calls
  ct_encode_string to convert it to a char*.

- ct_encode_string calls wctomb, which it expects to produce UTF-8, but
  because setlocale has not been called it in fact outputs ASCII.

- el_line then uses ct_enc_width, which assumes UTF-8 and returns 2, so
  the offset is adjusted by 2 even though only 1 byte was filled in.

- ftp obviously isn't happy about having a position after a \0, so it
  goes boom.
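
The mismatch described in the list above can be reproduced outside
libedit. This is only a minimal sketch, not libedit code: utf8_width()
copies the UTF-8 width table from the old ct_enc_width, and
locale_width() is a hypothetical helper that asks wctomb how many bytes
it really writes. What wctomb returns for a non-ASCII character when
setlocale has not been called depends on the libc (some write 1 byte,
some fail with -1), but it is not the 2 that the UTF-8 table predicts.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Width the character would have IF the output were UTF-8.
 * Same table as the old ct_enc_width. */
static size_t
utf8_width(wchar_t wc)
{
	unsigned long c = (unsigned long)wc;

	if (c < 0x80)
		return 1;
	else if (c < 0x0800)
		return 2;
	else if (c < 0x10000)
		return 3;
	else if (c < 0x110000)
		return 4;
	else
		return 0; /* not a valid codepoint */
}

/* Hypothetical helper: bytes wctomb actually writes in the current
 * locale, or 0 if the character cannot be encoded. */
static size_t
locale_width(wchar_t wc)
{
	char buf[MB_LEN_MAX];
	int n;

	n = wctomb(buf, wc);
	if (n < 0) {
		(void)wctomb(NULL, 0);	/* reset conversion state */
		return 0;
	}
	assert(n <= MB_LEN_MAX);
	return (size_t)n;
}
```

Without a setlocale call, comparing utf8_width(0xBA) with
locale_width(0xBA) shows the two disagree, which is the extra offset
that ends up in the cursor position.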

The setlocale() change below will only fix the problem if LC_CTYPE or
LC_ALL is set to a UTF-8 locale. ftp still dumps core if UTF-8 is pasted
in the C locale.

I think the right fix is for libedit to use the return value of wctomb
to adjust the offset rather than assuming UTF-8 and working out the
width itself.

Perhaps something like this (very lightly tested):

Index: chartype.c
===================================================================
RCS file: /cvs/src/lib/libedit/chartype.c,v
retrieving revision 1.4
diff -u -p -r1.4 chartype.c
--- chartype.c  17 Nov 2011 20:14:24 -0000      1.4
+++ chartype.c  31 Oct 2012 00:13:12 -0000
@@ -44,6 +44,8 @@
 #define CT_BUFSIZ 1024
 
 #ifdef WIDECHAR
+protected ssize_t ct_encode_char1(char *, size_t, Char);
+
 protected void
 ct_conv_buff_resize(ct_buffer_t *conv, size_t mincsize, size_t minwsize)
 {
@@ -178,27 +180,25 @@ ct_decode_argv(int argc, const char *arg
 protected size_t
 ct_enc_width(Char c)
 {
-       /* UTF-8 encoding specific values */
-       if (c < 0x80)
-               return 1;
-       else if (c < 0x0800)
-               return 2;
-       else if (c < 0x10000)
-               return 3;
-       else if (c < 0x110000)
-               return 4;
-       else
-               return 0; /* not a valid codepoint */
+       char s[MB_CUR_MAX];
+
+       return ct_encode_char1(s, sizeof s, c);
 }
 
 protected ssize_t
 ct_encode_char(char *dst, size_t len, Char c)
 {
-       ssize_t l = 0;
        if (len < ct_enc_width(c))
                return -1;
-       l = ct_wctomb(dst, c);
+       return ct_encode_char1(dst, len, c);
+}
 
+protected ssize_t
+ct_encode_char1(char *dst, size_t len, Char c)
+{
+       ssize_t l = 0;
+
+       l = ct_wctomb(dst, c);
        if (l < 0) {
                ct_wctomb_reset;
                l = 0;
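
The idea behind the diff above, advancing the offset by whatever wctomb
returns instead of by a precomputed UTF-8 width, can also be shown
standalone. This is a rough sketch under assumptions: encode_prefix() is
a hypothetical helper, not a libedit function, and it simply skips
characters the current locale cannot encode (similar to the l = 0 case
in the patch).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: encode the first n wide characters of src into
 * dst. Returns the number of bytes written, or -1 if dst is too small.
 * The offset only ever grows by the value wctomb reported.
 */
static long
encode_prefix(char *dst, size_t dstlen, const wchar_t *src, size_t n)
{
	char tmp[MB_LEN_MAX];
	size_t off = 0, i;
	int l;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < n; i++) {
		l = wctomb(tmp, src[i]);
		if (l < 0) {
			(void)wctomb(NULL, 0);	/* reset conversion state */
			continue;		/* unencodable: contributes 0 bytes */
		}
		if ((size_t)l > dstlen - off)
			return -1;
		memcpy(dst + off, tmp, (size_t)l);
		off += (size_t)l;
	}
	return (long)off;
}
```

Because the returned offset is the sum of the wctomb results, it cannot
point past the bytes that were really written, whatever the locale is.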




On Tue, Oct 30, 2012 at 11:56:18PM +0000, Nicholas Marriott wrote:
> Hi
> 
> The buffer isn't zero-terminated, it's the result of calling wctomb to
> convert the internal wchar_t* that libedit has into a char*.
> 
> libedit works out the offset in el_line with ct_enc_width which rather
> foolishly makes the assumption that wctomb will convert to UTF-8, but
> ftp doesn't call setlocale so it just leaves it as ASCII.
> 
> Try this:
> 
> Index: main.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/ftp/main.c,v
> retrieving revision 1.85
> diff -u -p -r1.85 main.c
> --- main.c    26 Aug 2012 02:16:02 -0000      1.85
> +++ main.c    30 Oct 2012 23:52:34 -0000
> @@ -67,6 +67,7 @@
>  
>  #include <ctype.h>
>  #include <err.h>
> +#include <locale.h>
>  #include <netdb.h>
>  #include <pwd.h>
>  #include <stdio.h>
> @@ -90,6 +91,8 @@ main(volatile int argc, char *argv[])
>       char *outfile = NULL;
>       const char *errstr;
>       int dumb_terminal = 0;
> +
> +     setlocale(LC_CTYPE, "");
>  
>       ftpport = "ftp";
>       httpport = "http";
> 
> 
> 
> 
> 
> On Tue, Oct 30, 2012 at 10:31:16PM +0100, Otto Moerbeek wrote:
> > On Tue, Oct 30, 2012 at 10:17:12PM +0100, Otto Moerbeek wrote:
> > 
> > > On Tue, Oct 30, 2012 at 08:59:27PM +0100, Juan Francisco Cantero Hurtado 
> > > wrote:
> > > 
> > > > On Tue, Oct 30, 2012 at 09:31:58AM +0100, Otto Moerbeek wrote:
> > > > > On Mon, Oct 29, 2012 at 06:43:13PM +0100, Juan Francisco Cantero 
> > > > > Hurtado wrote:
> > > > > 
> > > > > > Chris Cappuccio sent me a mail saying he can't see the characters, 
> > > > > > only
> > > > > > a question mark.
> > > > > > 
> > > > > > I'm linking each character to their wikipedia page, so you can
> > > > > > copy-paste the character.
> > > > > > 
> > > > > > On Thu, Oct 25, 2012 at 05:07:34AM +0200, Juan Francisco Cantero 
> > > > > > Hurtado wrote:
> > > > > > > This afternoon I was downloading a tarball from a OpenBSD mirror. 
> > > > > > > I
> > > > > > > > press the key "º" and after the tab key. ftp crashed with a 
> > > > > > > segfault.
> > > > > 
> > > > > Please also include your environment settings. It is likely locale
> > > > > plays a role here.
> > > > > 
> > > > > At least env | grep LC
> > > > > 
> > > > 
> > > > I've tried the bug in amd64 without locales and also with
> > > > LC_TIME="es_ES.ISO8859-1" LC_CTYPE="en_US.UTF-8".
> > > > 
> > > > The i386 system was a clean installation in a virtual machine.
> > > 
> > > I can now reproduce using a terminal that accepts more than just low 
> > > ascii.
> > > 
> > > What I see is that when complete() is called the cursor position in
> > > the EditLine struct is not what it is supposed to be, it points a
> > > couple of bytes beyond the terminating NUL while it is supposed to
> > > point to the NUL. That causes confusion in the scanner, getting the
> > > argument list count wrong.
> > 
> > Ehh, the buffer is not NUL terminated, but observation still holds:
> > the cursor position is a couple of bytes further than it
> > should be.
> > 
> > > 
> > > The root of the problem seems to be inside the editline lib.
> > > 
> > > Cc:ing nicm@, maybe he has a clue
> > > 
> > >   -Otto
> > >   
> > > 
> > > > 
> > > > > 
> > > > > > https://en.wikipedia.org/wiki/%C2%BA
> > > > > > > 
> > > > > > > Steps for reproduce:
> > > > > > > # ftp ftp.fr.openbsd.org
> > > > > > > user and password
> > > > > > > ascii art
> > > > > > > > ftp> cd pub/Openº    <- Here press the tab key
> > > > > > https://en.wikipedia.org/wiki/%C2%BA
> > > > > > > segmentation fault (core dumped)  ftp ftp.fr.openbsd.org
> > > > > > > 
> > > > > > > > It also crashes with the letter "Á" and "Ñ".
> > > > > > https://en.wikipedia.org/wiki/%C3%81
> > > > > > https://en.wikipedia.org/wiki/%C3%91
> > > > > > > 
> > > > > > > Tested in:
> > > > > > > - A snapshot from yesterday. i386. root account. console/ksh 
> > > > > > > without
> > > > > > >   locales.
> > > > > > > - A snapshot from a few days ago. amd64. user. urxvt/zsh with utf8
> > > > > > >   locales.
> > > > > > > 
> > > > > > > I also tested the bug in a remote session with OpenBSD 4.7 and 
> > > > > > > ftp works
> > > > > > > without problems.
> > > > > > > 
> > > > > > > I've updated the code of usr.bin/ftp to 2012-10-01 and 2012-01-01 
> > > > > > > and
> > > > > > > tried both versions. ftp also crashes.
> > > > > > > 
> > > > > > > Backtrace:
> > > > > > > Thread 1 (process 3436):
> > > > > > > #0  memcpy (dst0=0x9d4160, src0=Variable "src0" is not available.
> > > > > > > ) at /usr/src/lib/libc/string/bcopy.c:115
> > > > > > > #1  0x000000000040432b in complete (el=Variable "el" is not 
> > > > > > > available.
> > > > > > > ) at /usr/src/usr.bin/ftp/complete.c:313
> > > > > > > #2  0x000000000041eb84 in el_wgets (el=0x20da64800, 
> > > > > > > nread=0x7f7ffffe3ebc) at read.c:612
> > > > > > > #3  0x000000000041ef8d in el_gets (el=0x20da64800, nread=Variable 
> > > > > > > "nread" is not available.
> > > > > > > ) at eln.c:78
> > > > > > > #4  0x000000000040e55f in cmdscanner (top=Variable "top" is not 
> > > > > > > available.
> > > > > > > ) at /usr/src/usr.bin/ftp/main.c:465
> > > > > > > #5  0x000000000040eb7c in main (argc=1, argv=0x7f7ffffe4398) at 
> > > > > > > /usr/src/usr.bin/ftp/main.c:369
> > > > > > > 
> > > > > > > Let me know if it's necessary more info or whatever :)
> > > > > > > 
> > > > > > > Cheers.
> > > > > > > 
> > > > > > 
> > > > 
> > > > -- 
> > > > Juan Francisco Cantero Hurtado http://juanfra.info
