Hi,

Anton Lindqvist wrote on Sun, May 28, 2017 at 06:07:00PM +0200:
> On Sun, May 28, 2017 at 10:56:19AM +0200, Walter Alejandro Iglesias wrote:
>> There is still a similar issue when you try to "replace" a utf-8
>> character (in command mode press 'r' to replace a single character or
>> 'R' to replace a string).

> Thanks for the report, please try out the diff below.
> As I understand the problem: the current code assumes that the character
> to replace consists of a single byte, which is not true for Unicode
> characters.

Correct.  That needs to be improved.

> When replacing such a character, delete the continuation
> bytes and then replace the start byte with the replacement.
> This ensures no continuation bytes are left behind.
> I made use of putbuf() since it has the side-effect of advancing the
> cursor.
> Lastly, adjust the cursor to be positioned on the last replaced
> character.
>
> NUL-terminating the line buffer is necessary in order for the following
> to work:
>
> 1. Insert ö
>
> 2. Press esc, h (back one char), ro (replace with o), ax (append x)
>
> Note that replacing a character with a Unicode character does not work
> either.
>
> Comments? OK?
>
> Index: bin/ksh/vi.c
> ===================================================================
> RCS file: /cvs/src/bin/ksh/vi.c,v
> retrieving revision 1.45
> diff -u -p -r1.45 vi.c
> --- bin/ksh/vi.c 28 May 2017 07:27:01 -0000 1.45
> +++ bin/ksh/vi.c 28 May 2017 15:59:59 -0000
> @@ -926,13 +926,22 @@ vi_cmd(int argcnt, const char *cmd)
>              if (cmd[1] == 0)
>                  vi_error();
>              else {
> -                int n;
> -
>                  if (es->cursor + argcnt > es->linelen)
>                      return -1;

These two lines are no longer accurate.  They try to make sure
there are enough characters under and to the right of the cursor
to match the number you want to replace (for example, with "2r"),
and beep otherwise - but they count bytes, which is wrong.

To catch the error condition of an excessive argument, i think
you first need to iterate to the right, using the c1 variable
and isu8cont(), and return -1 if you hit the end prematurely.
Do not change anything in that case.
If you succeed so far, you know you have to replace the range
[es->cursor, c1].

> -                for (n = 0; n < argcnt; ++n)
> -                    es->cbuf[es->cursor + n] = cmd[1];
> -                es->cursor += n - 1;
> +
> +                while (argcnt-- > 0) {
> +                    for (cur = es->cursor + 1;
> +                        cur < es->linelen; cur++)
> +                        if (!isu8cont(es->cbuf[cur]))
> +                            break;
> +                    if (cur > 1)
> +                        del_range(es->cursor, cur - 1);

Given that you don't know the length (in bytes) of the character
to insert yet, i think it may be simpler to delete the byte under
the cursor as well, even though that is slightly inefficient for
the ASCII case.

> +                    putbuf(&cmd[1], 1, 1);

It seems that here, you may need to measure the length of the
character to insert in bytes and then call something like

    putbuf(cmd + 1, #bytes, 0);

My impression is that the 's' command is likely also affected,
but that can be fixed in a separate patch.

Yours,
  Ingo
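
P.S.  To make the two suggestions above a bit more concrete, here is
a rough standalone sketch, not tested against the ksh sources.  The
names u8_span_end() and u8_char_len() are made up for illustration,
and is_u8cont() only mirrors what isu8cont() is assumed to check
(a byte of the form 10xxxxxx).  Note that the returned end offset is
exclusive, i.e. one past the last byte to replace.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed equivalent of isu8cont(): true for UTF-8 continuation bytes. */
static int
is_u8cont(unsigned char c)
{
	return (c & 0xc0) == 0x80;
}

/*
 * Hypothetical helper: starting at byte offset cursor, step over
 * nchars characters and return the offset just past the last one,
 * or -1 if the line ends first.  The buffer is never modified, so
 * the caller can beep and leave everything untouched on -1.
 */
static int
u8_span_end(const char *buf, int linelen, int cursor, int nchars)
{
	int c1 = cursor;

	assert(buf != NULL);
	while (nchars-- > 0) {
		if (c1 >= linelen)
			return -1;	/* argument too large */
		c1++;			/* step over the start byte */
		while (c1 < linelen && is_u8cont((unsigned char)buf[c1]))
			c1++;		/* and its continuation bytes */
	}
	return c1;
}

/*
 * Hypothetical helper: length in bytes of the character starting
 * at s, looking at no more than maxlen bytes.
 */
static int
u8_char_len(const char *s, int maxlen)
{
	int n;

	assert(s != NULL);
	if (maxlen <= 0)
		return 0;
	n = 1;
	while (n < maxlen && is_u8cont((unsigned char)s[n]))
		n++;
	return n;
}
```

With something along these lines, the 'r' code could first compute
the end of the range, return -1 early if that fails, delete the
whole range including the byte under the cursor, and then insert
the replacement argcnt times using the measured byte length.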