Re: ksh tab completion bug
Hi Anton, Thanks for providing those links to the past discussion. Anton Lindqvist wrote: > As I would address this, the numbers of arguments passed to the > completion related routines is already painfully^W long. I would rather > take a step back and introduce a `struct globstate' (just an example, > could be renamed) which includes all the necessary parameters. Getting > such refactoring in place first would make the diff even smaller. After thinking about this some more, I think the word length is not really the right thing to check, even if you account for escape characters. What we really want to know is whether or not the current word was modified by completing up to a common prefix. Here is another (slightly contrived) case where completion is not performed, even when backslashes are not involved: Say you have a user somelonguser, whose home directory is /a, and there are two files in that directory, /a/test1 and /a/test2. ksh will not complete ~somelonguser/t to /a/test since the expanded word is shorter than the original. This example works on vi completion for the same reason as before. I took a stab at changing do_complete to use a similar approach as vi completion (diff below), and it is a bit simpler (only modifies emacs.c and net removal). However, I still have some concerns: * There is a slight behavior change when you have test1 and test2, and manually type the entire common prefix. Previously, the options would be displayed on the first , but now they are only displayed on the second . * The use of a global variable to track whether the word was modified since the last completion is a bit messy. vi.c contains a bunch of `expanded = NONE` scattered throughout the file. It is quite possible that the two I added to emacs.c are not sufficient for some editing operations. Here's another idea for a fix that might address those issues: * After the call to x_cf_glob, save the part of the buffer between start and end in some temporary location. * Always replace start through end with the longest common prefix. * After replacement, compare the escaped word with the saved original word. If it matches exactly, print the replacement options. This would avoid the global variable, but I'm not sure how to determine the new end position after x_escape() since it just calls x_do_ins() directly. Though not quite related, I noticed that x_expand does not call x_free_words before it returns, so it leaks memory in ATEMP. But since that gets reclaimed after each command, it's probably not worth worrying about. > Also, emacs mode has its own regress suite located in > regress/bin/ksh/edit/emacs.sh. Tricky logic like this should be covered > in my opinion. Below is a rough diff how to test file name completion. I'm using OpenBSD's ksh through oksh, which does not contain the regress tests. I think I'd have to get an OpenBSD VM up and running in order to contribute a test case. > Michael, is this something you would like to continue work on? I would like to see this issue fixed, but now that I have a local patch that works for me, my motivation is dwindling. I'm happy to fix any details with my diff, but if there is substantial refactoring involved in order to merge, then probably not. I also have no particular attachment to the diffs I wrote. If anyone has their own idea about a better solution, feel free to go ahead with that. diff --git a/emacs.c b/emacs.c index 2b70992..2ec4fc4 100644 --- a/emacs.c +++ b/emacs.c @@ -130,6 +130,7 @@ static int x_arg_set; static char*macro_args; static int prompt_skip; static int prompt_redraw; +static int completed; static int x_ins(char *); static voidx_delete(int, int); @@ -460,6 +461,7 @@ x_ins(char *s) if (x_do_ins(s, strlen(s)) < 0) return -1; + completed = 0; /* * x_zots() may result in a call to x_adjust() * we want xcp to reflect the new position. @@ -566,7 +568,7 @@ x_delete(int nc, int push) for (cp = x_lastcp(); cp > xcp; ) x_bs(*--cp); - return; + completed = 0; } static int @@ -1749,7 +1751,6 @@ do_complete(int flags,/* XCF_{COMMAND,FILE,COMMAND_FILE} */ int nwords; int start, end, nlen, olen; int is_command; - int completed = 0; nwords = x_cf_glob(flags, xbuf, xep - xbuf, xcp - xbuf, &start, &end, &words, &is_command); @@ -1759,7 +1760,7 @@ do_complete(int flags,/* XCF_{COMMAND,FILE,COMMAND_FILE} */ return; } - if (type == CT_LIST) { + if (type == CT_LIST || type == CT_COMPLIST && completed) { x_print_expansions(nwords, words, is_command); x_redraw(0); x_free_words(nwords, words); @@ -1769,27 +1770,20 @@ do_complete(int flags, /* XCF_{COMMAND,FILE,COMMAND_FILE} */ olen = end - start; nlen = x_longest_prefix(nwords, words);
Re: ksh tab completion bug
On Tue, Nov 10, 2020 at 06:49:28PM +0100, Sven M. Hallberg wrote: > Apologies for jumping in as a bystander, but if I may comment: > > Anton Lindqvist on Tue, Nov 10 2020: > > Been brought up before[1] and rejected[2][3]. > > Anton Lindqvist on Sun, Jul 02, 2017: > > diff below in which slashes are discarded when comparing the length. I > > don't know if any other character should be discarded as well, if true > > then it might be worth passing the input buffer through ksh's own > > lexer and [...] > > The patch then seems to have been rejected for further complicating some > already hairy code and the above does sound rather unsafe. > > Michael's aim here appears to be to simplify things: > > Michael Forney on Mon, Nov 09, 2020: > > emacs.c:do_complete() using `end - start` for olen (which counts the > > backslashes), but nlen (the longest prefix length) does not. [...] > > > > I don't believe vi.c:complete_word() has this issue [...] > > > > [my diff] is longer than I had hoped. Perhaps there is a simpler patch > > to make emacs.c:do_complete() use the same approach as vi completion. > > Maybe some iteration is all that is needed. I not saying the diff is bad in anyway, just sharing a related data point. I do think this would be an improvement and the approach looks better than mine. As I would address this, the numbers of arguments passed to the completion related routines is already painfully^W long. I would rather take a step back and introduce a `struct globstate' (just an example, could be renamed) which includes all the necessary parameters. Getting such refactoring in place first would make the diff even smaller. Also, emacs mode has its own regress suite located in regress/bin/ksh/edit/emacs.sh. Tricky logic like this should be covered in my opinion. Below is a rough diff how to test file name completion. Michael, is this something you would like to continue work on? Index: emacs.sh === RCS file: /cvs/src/regress/bin/ksh/edit/emacs.sh,v retrieving revision 1.11 diff -u -p -r1.11 emacs.sh --- emacs.sh3 Apr 2019 14:59:34 - 1.11 +++ emacs.sh10 Nov 2020 19:53:30 - @@ -139,7 +139,11 @@ testseq "a\0033#" " # a\r # #a \0010\001 # XXX ^[^[: complete # XXX ^X^[: complete-command -# XXX ^[^X: complete-file + +# ^[^X: complete-file +: >abc +testseq "ls \0033\0030" " # abc" + # XXX ^I, ^[=: complete-list # [n] ERASE, ^?, ^H: delete-char-backward Index: subr.sh === RCS file: /cvs/src/regress/bin/ksh/edit/subr.sh,v retrieving revision 1.8 diff -u -p -r1.8 subr.sh --- subr.sh 21 Nov 2017 19:25:46 - 1.8 +++ subr.sh 10 Nov 2020 19:53:30 - @@ -18,7 +18,7 @@ testseq() { stdin=$1 exp=$(echo "$2") - act=$(echo -n "$stdin" | ./edit -p "$PS1" ${KSH:-/bin/ksh} -r 2>&1) + act=$(echo -n "$stdin" | "$EDIT" -p "$PS1" ${KSH:-/bin/ksh} -r 2>&1) [ $? = 0 ] && [ "$exp" = "$act" ] && return 0 dump input "$stdin" @@ -32,3 +32,9 @@ dump() { printf '%s:\n>>>%s<<<\n' "$1" "$(echo -n "$2" | vis -ol)" echo -n "$2" | hexdump -Cv } + +EDIT="${PWD}/edit" + +TMP="$(mktemp -dt ksh.XX)" +trap 'rm -r $TMP' EXIT +cd "$TMP"
Re: ksh tab completion bug
Apologies for jumping in as a bystander, but if I may comment: Anton Lindqvist on Tue, Nov 10 2020: > Been brought up before[1] and rejected[2][3]. Anton Lindqvist on Sun, Jul 02, 2017: > diff below in which slashes are discarded when comparing the length. I > don't know if any other character should be discarded as well, if true > then it might be worth passing the input buffer through ksh's own > lexer and [...] The patch then seems to have been rejected for further complicating some already hairy code and the above does sound rather unsafe. Michael's aim here appears to be to simplify things: Michael Forney on Mon, Nov 09, 2020: > emacs.c:do_complete() using `end - start` for olen (which counts the > backslashes), but nlen (the longest prefix length) does not. [...] > > I don't believe vi.c:complete_word() has this issue [...] > > [my diff] is longer than I had hoped. Perhaps there is a simpler patch > to make emacs.c:do_complete() use the same approach as vi completion. Maybe some iteration is all that is needed. Regards, pesco
Re: ksh tab completion bug
On Mon, Nov 09, 2020 at 11:15:36PM -0800, Michael Forney wrote: > I noticed some strange behavior of ksh in emacs mode when completing > file names that contain spaces (or other characters that need > escaping). > > To illustrate the problem, consider two files 'a b c test1' and > 'a b c test2'. ksh will correctly complete `a` and `a\ b\ c\ ` to > the common prefix `a\ b\ c\ test`. However, all of the following > do not get completed: > > * `a\ b\ c\ t` > * `a\ b\ c\ te` > * `a\ b\ c\ tes` > > I did some debugging, and I think this is due to emacs.c:do_complete() > using `end - start` for olen (which counts the backslashes), but > nlen (the longest prefix length) does not. So the completion only > occurs when the current word is shorter than the common prefix > length minus the number of backslashes in the word. > > I don't believe vi.c:complete_word() has this issue since it always > replaces the word with the longest prefix. > > I wrote a (only lightly tested) patch to make x_cf_glob return the > unescaped length as well as the start and end positions, and use > that for olen instead of `end - start`. This seems to fix the issue, > but it is longer than I had hoped. Perhaps there is a simpler patch > to make emacs.c:do_complete() use the same approach as vi completion. Been brought up before[1] and rejected[2][3]. [1] https://marc.info/?l=openbsd-bugs&m=149902839123905&w=2 [2] https://marc.info/?l=openbsd-bugs&m=149925960003395&w=2 [3] https://marc.info/?l=openbsd-bugs&m=149925991603581&w=2
ksh tab completion bug
I noticed some strange behavior of ksh in emacs mode when completing file names that contain spaces (or other characters that need escaping). To illustrate the problem, consider two files 'a b c test1' and 'a b c test2'. ksh will correctly complete `a` and `a\ b\ c\ ` to the common prefix `a\ b\ c\ test`. However, all of the following do not get completed: * `a\ b\ c\ t` * `a\ b\ c\ te` * `a\ b\ c\ tes` I did some debugging, and I think this is due to emacs.c:do_complete() using `end - start` for olen (which counts the backslashes), but nlen (the longest prefix length) does not. So the completion only occurs when the current word is shorter than the common prefix length minus the number of backslashes in the word. I don't believe vi.c:complete_word() has this issue since it always replaces the word with the longest prefix. I wrote a (only lightly tested) patch to make x_cf_glob return the unescaped length as well as the start and end positions, and use that for olen instead of `end - start`. This seems to fix the issue, but it is longer than I had hoped. Perhaps there is a simpler patch to make emacs.c:do_complete() use the same approach as vi completion. diff --git a/bin/ksh/edit.c b/bin/ksh/edit.c index 3089d195d20..d44afc258ba 100644 --- a/bin/ksh/edit.c +++ b/bin/ksh/edit.c @@ -29,7 +29,7 @@ static void check_sigwinch(void); static int x_file_glob(int, const char *, int, char ***); static int x_command_glob(int, const char *, int, char ***); -static int x_locate_word(const char *, int, int, int *, int *); +static int x_locate_word(const char *, int, int, int *, int *, int *); /* Called from main */ @@ -524,15 +524,16 @@ x_command_glob(int flags, const char *str, int slen, char ***wordsp) (c) == '`' || (c) == '=' || (c) == ':' ) static int -x_locate_word(const char *buf, int buflen, int pos, int *startp, +x_locate_word(const char *buf, int buflen, int pos, int *startp, int *endp, int *is_commandp) { int p; - int start, end; + int start, end, len; /* Bad call? Probably should report error */ if (pos < 0 || pos > buflen) { *startp = pos; + *endp = pos; *is_commandp = 0; return 0; } @@ -546,10 +547,13 @@ x_locate_word(const char *buf, int buflen, int pos, int *startp, (start > 1 && buf[start-2] == '\\'); start--) ; /* Go forwards to end of word */ + len = 0; for (end = start; end < buflen && IS_WORDC(buf[end]); end++) { if (buf[end] == '\\' && (end+1) < buflen) end++; + len++; } + *endp = end; if (is_commandp) { int iscmd; @@ -574,7 +578,7 @@ x_locate_word(const char *buf, int buflen, int pos, int *startp, *startp = start; - return end - start; + return len; } static int @@ -654,14 +658,14 @@ x_try_array(const char *buf, int buflen, const char *want, int wantlen, int x_cf_glob(int flags, const char *buf, int buflen, int pos, int *startp, -int *endp, char ***wordsp, int *is_commandp) +int *endp, int *lenp, char ***wordsp, int *is_commandp) { - int len; + int len, esclen; int nwords; char **words = NULL; int is_command; - len = x_locate_word(buf, buflen, pos, startp, &is_command); + len = x_locate_word(buf, buflen, pos, startp, endp, &is_command); if (!(flags & XCF_COMMAND)) is_command = 0; /* Don't do command globing on zero length strings - it takes too @@ -671,10 +675,11 @@ x_cf_glob(int flags, const char *buf, int buflen, int pos, int *startp, if (len == 0 && is_command) return 0; + esclen = *endp - *startp; if (is_command) - nwords = x_command_glob(flags, buf + *startp, len, &words); - else if (!x_try_array(buf, buflen, buf + *startp, len, &nwords, &words)) - nwords = x_file_glob(flags, buf + *startp, len, &words); + nwords = x_command_glob(flags, buf + *startp, esclen, &words); + else if (!x_try_array(buf, buflen, buf + *startp, esclen, &nwords, &words)) + nwords = x_file_glob(flags, buf + *startp, esclen, &words); if (nwords == 0) { *wordsp = NULL; return 0; @@ -683,7 +688,7 @@ x_cf_glob(int flags, const char *buf, int buflen, int pos, int *startp, if (is_commandp) *is_commandp = is_command; *wordsp = words; - *endp = *startp + len; + *lenp = len; return nwords; } diff --git a/bin/ksh/edit.h b/bin/ksh/edit.h index 0b604cd64fb..f988d92c4b2 100644 --- a/bin/ksh/edit.h +++ b/bin/ksh/edit.h @@ -43,7 +43,7 @@ bool x_mode(bool); intpromptlen(const char *, const char **); intx_do_comment(char *, int, int *); void x_print_expansions(int, char *const *, i