On Sun, Sep 28, 2008 at 4:35 PM, Tony Mechelynck wrote:
>
>> On Sun, Sep 28, 2008 at 9:40 AM, John Hughes wrote:
>>> I am trying to write a command that substitutes some Ascii characters
>>> with a Unicode character. The following substitution works when
>>> entered directly:
>>>
>>> :%s/\.\.\./…/eg
>>>
>>> However, when defined as a command, it does not work:
>>>
>>> :com Ellipsis %s/\.\.\./…/eg
>>>
>>> The command :Ellipsis converts
>>>
>>> ...
>>>
>>> into
>>>
>>> â<80><fe>X¦
>>>
>>> Why is this? Is there any way of using Unicode characters in
>>> substitute commands?
>
> I'm using gvim 7.2.21, huge build with Gnome2 GUI and 'encoding' set to
> UTF-8. Just like the OP, I see the following:
>
> - Typing the :s command at the command-line works OK.
> - Defining that :s command as a user-command text, then running that
> user command, replaces every set of three dots by â<80><fe>X¦ (5
> characters including two invalid UTF-8 sequences, 7 bytes viz. C3 A2 80
> FE 58 C2 A6).
> - Recalling that command definition with ":command Ellipsis" displays
> the ellipsis character as an ellipsis.
> - The ellipsis is U+2026, in UTF-8 0xE2 0x80 0xA6. Notice that 80 and A6
> appear (though not consecutively) as part of the replace-text actually
> used, and that E2 is C3 A2 which also appears. This makes me suspect
> that Vim is applying a spurious Latin1-to-UTF8 conversion to what is
> already UTF-8 (with something wrong, maybe buffer-overflow, happening in
> the middle). Another possibility would be using a "character length"
> instead of a "byte length", or vice-versa, at some point in the
> user-command execution.

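The byte arithmetic in the quoted message can be checked with a short Python sketch. It only uses Python's standard codecs and says nothing about what Vim does internally; the variable names and the hexdump() helper are made up for illustration. It compares what a plain Latin1-to-UTF8 conversion of the already-encoded ellipsis would produce with the 7 bytes reported above.

```python
# U+2026 HORIZONTAL ELLIPSIS and its UTF-8 encoding.
ellipsis = "\u2026"
utf8 = ellipsis.encode("utf-8")

# A plain "Latin-1 to UTF-8" conversion applied to bytes that are already
# UTF-8: each byte is read as one Latin-1 character and re-encoded.
double_encoded = utf8.decode("latin-1").encode("utf-8")

# The 7 bytes reported in the quoted message.
observed = bytes.fromhex("C3 A2 80 FE 58 C2 A6")


def hexdump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


print("UTF-8 of U+2026:", hexdump(utf8))
print("double-encoded: ", hexdump(double_encoded))
print("observed:       ", hexdump(observed))
```

The double-encoded form shares its first two and last two bytes with the observed output, but the middle differs (C2 80 versus 80 FE 58), so a plain double conversion alone does not account for the result.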
I can confirm this. It looks to me like it's not a spurious
Latin1-to-UTF8 conversion, but an internally-escaped string that's not
un-escaped before being used. From diving into the source, it seems
that mb_unescape() is called to unescape any multibyte characters when
displaying the command, but that mb_unescape() is never called before
the command is passed to do_cmdline() to be executed. That seems to
explain why it's displayed properly but executed incorrectly.

I don't completely follow all of the string escaping being done here,
though, so only Bram would know for sure. I've cross-posted to the
vim-dev list accordingly.

~Matt
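
To illustrate the escaping theory above, here is a rough Python sketch. It rests on an assumption that is not stated in this thread: that Vim's internal escaping stores a literal 0x80 byte as the three bytes 80 FE 58 (K_SPECIAL, KS_SPECIAL, KE_FILLER, constant names and values recalled from Vim's keymap.h). The helper names escape_k_special() and unescape_k_special() are hypothetical and are not Vim functions.

```python
# Assumed constants, recalled from Vim's keymap.h; treat them as assumptions.
K_SPECIAL = 0x80   # introducer byte for internally encoded special keys
KS_SPECIAL = 0xFE  # second byte: "this stands for a literal K_SPECIAL byte"
KE_FILLER = 0x58   # third byte, ASCII 'X'

ESCAPED_TRIPLE = bytes([K_SPECIAL, KS_SPECIAL, KE_FILLER])


def escape_k_special(data: bytes) -> bytes:
    """Hypothetical helper: replace each 0x80 byte with 80 FE 58."""
    out = bytearray()
    for b in data:
        if b == K_SPECIAL:
            out += ESCAPED_TRIPLE
        else:
            out.append(b)
    return bytes(out)


def unescape_k_special(data: bytes) -> bytes:
    """Hypothetical inverse: turn each 80 FE 58 triple back into 0x80."""
    return data.replace(ESCAPED_TRIPLE, bytes([K_SPECIAL]))


raw = "\u2026".encode("utf-8")   # E2 80 A6
escaped = escape_k_special(raw)  # what would be stored, under the assumption
observed = bytes.fromhex("C3 A2 80 FE 58 C2 A6")  # bytes reported in the thread

print("raw:      ", raw.hex(" ").upper())
print("escaped:  ", escaped.hex(" ").upper())
print("observed: ", observed.hex(" ").upper())
print("round trip ok:", unescape_k_special(escaped) == raw)
```

Under that assumption, the escaped form contains the same 80 FE 58 run that sits in the middle of the observed bytes, and the "X" in the garbled output would be the filler byte. The remaining two bytes, E2 and A6, show up in the observed output as C3 A2 and C2 A6, which is what each becomes when read as a Latin-1 character and written as UTF-8; why that would happen to just those bytes is not established here.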
