Re: Changing encoding of an already loaded buffer
On 10/12/2020 14.04, A. Wik wrote:
> I just like to keep things "8-bit clean". As long as all tools used
> to process the files are also 8-bit clean, nothing gets corrupted.
> Alas, it does mean files are sometimes displayed incorrectly. But in
> my experience, it gets messy when I introduce UTF-8.

OK, my experience is instead that a lot of tools do mess up the encodings, and it's hard to promptly recognize those mess-ups when not using a UTF encoding. I guess it comes down to one's usual tools, needs and habits.

> There is something to it. People who use only ASCII seem to like
> UTF-8 better than those who frequently use non-English characters.
> I've seen claims that UTF-8 is "compact" but compared to strictly
> 8-bit character sets like Latin-1 it is not.

Maybe that was the case in the first years of UTF-8; since then several tests have shown that UTF-8 is fairly efficient even for Asian languages, so I think it's generally well accepted and the controversy is just about the BOM. Anyway, I don't think anyone who needs non-English characters has ever favoured any old non-Unicode encoding; Unicode is a blessing precisely for them.

--
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
---
You received this message because you are subscribed to the Google Groups "vim_use" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/vim_use/dee925c6-31e9-4d3d-4a9d-83a1f4f20070%40tiscali.it.
Re: Changing encoding of an already loaded buffer
On 10/12/2020 5.58, Tony Mechelynck wrote:
> The problem with ":setg fenc=utf8 bomb" is that *every* new text file
> will start with 0xEF 0xBB 0xBF unless you explicitly turn it off for
> that file by means of ":setl nobomb" or ":setl fenc=latin1" or similar
> before writing it.

That's the point, indeed.

> For C sources this will confuse the compiler (generating an error and
> preventing successful compilation) and for anything starting with a
> shebang (shell scripts, perl sources, etc.) it will prevent the #!
> shebang leader from being recognized. OTOH for

It's true, it depends on what you mostly do in the editor: if you frequently need to create files that cannot have a BOM in them, it's most likely inconvenient. Maybe use more than one editor, or aliases with different configurations...? Personally, I use text editors mostly for normal textual or web files, use mostly IDEs for programming, rarely edit shell scripts, and it may well be that I usually left 'bomb' disabled when using unices...

Anyway, for textual files or filetypes that do support the BOM, I believe it's more beneficial to include it, and that it should not be discouraged.
Re: Changing encoding of an already loaded buffer
On 09/12/2020 21.19, Gabriele F wrote:
> Completely off-topic, if you don't have particular needs I'd advise you
> to use UTF-8 with BOMs for all your new files ('set bomb', 'set
> encoding=utf-8' and 'fenc' left to the default in your vimrc), it will
> prevent any future encoding problem for at least them.
>
> I've been doing so for more than a decade and pretty much never had
> problems, and sigh a relief every time I see I'm working with one of them.

I should have specified that during that time I mostly used other text editors, and on Windows. I've been using Vim for only a few years, and I still use other editors more frequently. Although I do have a "set bomb" in my vimrc, I have less experience with it in Vim, and I am still on Windows most of the time.
Re: Changing encoding of an already loaded buffer
On Thu, 10 Dec 2020 at 15:04, A. Wik wrote:
>
> On Wed, 9 Dec 2020 at 20:20, Gabriele F wrote:
> ..
> > I imagine most of the critics are from countries that never needed more
> > than ASCII
>
> There is something to it. People who use only ASCII seem to like
> UTF-8 better than those who frequently use non-English characters.
> I've seen claims that UTF-8 is "compact" but compared to strictly
> 8-bit character sets like Latin-1 it is not.

To people who use only ASCII the distinction between ASCII and UTF-8 is totally irrelevant, because in their case UTF-8 is precisely ASCII by definition. But people like me, who regularly use scripts other than Latin, and who also like to indulge themselves with mathematical and other ‘special’ characters in plain text – they are those who really appreciate and praise the advent of Unicode and UTF-8.
Re: Changing encoding of an already loaded buffer
On 09/12/2020 20.35, Gabriele F wrote:
> That :%!cat is indeed a neat (if hacky) idea!

It should be noted, though, that it works only as long as the 'shelltemp' option is on, which is the default. 'shelltemp' makes Vim use a temporary file for the filtering instead of a pipe, which is evidently the (probably accidental) cause of the effects on the encoding.
Re: Changing encoding of an already loaded buffer
I should add that those tests were all made with 'encoding' set to utf-8 in my vimrc; I haven't tried with the default latin1 or other values, and I don't know whether this influenced anything. That's the setting that A. Wik said he has as well, anyway.
Re: Changing encoding of an already loaded buffer
On Thu, Dec 10, 2020 at 2:04 PM A. Wik wrote:
>
> On Wed, 9 Dec 2020 at 20:20, Gabriele F wrote:
> >
> > On 09/12/2020 18.47, A. Wik wrote:
> > > I don't include utf8 in my default fencs setting because that has the
> > > side effect of using utf8 for any newly created files.
> >
> > Completely off-topic, if you don't have particular needs ...
>
> I just like to keep things "8-bit clean". As long as all tools used
> to process the files are also 8-bit clean, nothing gets corrupted.
> Alas, it does mean files are sometimes displayed incorrectly. But in
> my experience, it gets messy when I introduce UTF-8.
>
> > I imagine most of the critics are from countries that never needed more
> > than ASCII
>
> There is something to it. People who use only ASCII seem to like
> UTF-8 better than those who frequently use non-English characters.
> I've seen claims that UTF-8 is "compact" but compared to strictly
> 8-bit character sets like Latin-1 it is not.
>
> -aw

- For pure 7-bit ASCII, all three of us-ascii, Latin1 and UTF-8 are equivalent: they represent the data identically.
- For "Western Latin" (French, Spanish, etc.) Latin1 is slightly more economical than UTF-8. How much more depends on the percent abundance of accented letters not found in ASCII.
- When mixing several scripts (at least two of Latin, Greek, Cyrillic, Hebrew, Arabic, CJK ideographic, etc.) within a single document, I know of no better encoding than UTF-8. In an 8-bit charset like Latin1 you have only (at most) 256 different valid character values, and that is much too few as soon as you start mixing scripts: be it for a juxtalinear edition of the Bible (with the original Hebrew, Aramaic or Greek text next to a translation and/or commentary) or for a Greek-Russian or Russian-Finnish dictionary. And of course even for a single CJK script, no 8-bit charset can do the job.

Best regards,
Tony.
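The size comparison in the three points above can be checked with a few lines of Python. This is only a minimal sketch: the sample strings and the helper name `encoded_sizes` are made up for illustration, and the actual overhead depends entirely on the text being encoded.

```python
# Hypothetical sample texts; any strings of your own will do.
samples = {
    "ascii": "plain ASCII text",
    "french": "d\u00e9j\u00e0 vu, gar\u00e7on, \u00e9l\u00e8ve",
    "greek": "\u03b1\u03b2\u03b3",
}


def encoded_sizes(text):
    """Return (latin1_size, utf8_size) in bytes.

    latin1_size is None when the text cannot be represented in Latin-1.
    """
    try:
        latin1_size = len(text.encode("latin-1"))
    except UnicodeEncodeError:
        latin1_size = None
    return latin1_size, len(text.encode("utf-8"))


for name, text in samples.items():
    print(name, encoded_sizes(text))
```

For the ASCII sample both sizes are equal, for the French sample UTF-8 costs one extra byte per accented letter, and the Greek sample has no Latin-1 representation at all.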
Re: Changing encoding of an already loaded buffer
On Wed, 9 Dec 2020 at 20:20, Gabriele F wrote:
>
> On 09/12/2020 18.47, A. Wik wrote:
> > I don't include utf8 in my default fencs setting because that has the
> > side effect of using utf8 for any newly created files.
>
> Completely off-topic, if you don't have particular needs ...

I just like to keep things "8-bit clean". As long as all tools used to process the files are also 8-bit clean, nothing gets corrupted. Alas, it does mean files are sometimes displayed incorrectly. But in my experience, it gets messy when I introduce UTF-8.

> I imagine most of the critics are from countries that never needed more
> than ASCII

There is something to it. People who use only ASCII seem to like UTF-8 better than those who frequently use non-English characters. I've seen claims that UTF-8 is "compact" but compared to strictly 8-bit character sets like Latin-1 it is not.

-aw
Re: Changing encoding of an already loaded buffer
On Wed, Dec 9, 2020 at 9:20 PM Gabriele F wrote:
>
> On 09/12/2020 18.47, A. Wik wrote:
> > I don't include utf8 in my default fencs setting because that has the
> > side effect of using utf8 for any newly created files.
>
> Completely off-topic, if you don't have particular needs I'd advise you
> to use UTF-8 with BOMs for all your new files ('set bomb', 'set
> encoding=utf-8' and 'fenc' left to the default in your vimrc), it will
> prevent any future encoding problem for at least them.
>
> I've been doing so for more than a decade and pretty much never had
> problems, and sigh a relief every time I see I'm working with one of them.
>
> I heard many protest the BOMs in UTF-8, but they are the first thing
> ever to allow a reliable encoding detection and they solve a lot more
> problems than they can cause (if they cause problems they usually do so
> immediately and noticeably, much better than discovering years later
> that you irremediably botched the encoding of some file). So I find it
> absurd to disparage them, and delusive to think that we'll ever get to a
> point when non-utf8 files will be rare enough that we won't need to
> handle them.
> I imagine most of the critics are from countries that never needed more
> than ASCII

IIUC the criticism comes from people who do a lot of programming, either in C (where sources are supposed to be in Latin1; they may be in UTF-8 if characters above U+007F are used only in alphanumeric literals, but they cannot start with a BOM) or in Perl, Python, Unix shell script language, etc. (where the first two bytes of a source file must be #! in that order).

The problem with ":setg fenc=utf8 bomb" is that *every* new text file will start with 0xEF 0xBB 0xBF unless you explicitly turn it off for that file by means of ":setl nobomb" or ":setl fenc=latin1" or similar before writing it. For C sources this will confuse the compiler (generating an error and preventing successful compilation) and for anything starting with a shebang (shell scripts, perl sources, etc.) it will prevent the #! shebang leader from being recognized. OTOH for "well-behaved" filetypes like Vim scripts (if not run by means of a shebang), HTML pages, CSS style sheets, etc., there is no problem.

So whether or not to set it should depend on what types of files you write most often. I use it because most of the files I write are HTML or CSS, followed by Vim scripts; but then when I write a shell script I have to remember to turn the 'bomb' setting off for that file.

Best regards,
Tony.
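To see why a BOM gets in the way of a shebang, here is a small Python sketch. It assumes Python's `utf-8-sig` codec as a stand-in for what writing with 'bomb' set produces (the same three bytes 0xEF 0xBB 0xBF in front of the text); the script content is made up for illustration.

```python
import codecs

# A made-up two-line shell script.
script = "#!/bin/sh\necho hello\n"

plain = script.encode("utf-8")         # UTF-8 without a BOM
with_bom = script.encode("utf-8-sig")  # UTF-8 with the BOM prepended

# The BOM occupies the first three bytes of the file.
print(with_bom[:3].hex(" "))
# Without a BOM the file starts with "#!"; with a BOM it no longer does.
print(plain.startswith(b"#!"), with_bom.startswith(b"#!"))
```

Anything that checks only the first two bytes for "#!" will therefore not find them in the file written with a BOM.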
Re: Changing encoding of an already loaded buffer
On 09/12/2020 18.47, A. Wik wrote:
> I don't include utf8 in my default fencs setting because that has the
> side effect of using utf8 for any newly created files.

Completely off-topic: if you don't have particular needs, I'd advise you to use UTF-8 with BOMs for all your new files ('set bomb', 'set encoding=utf-8' and 'fenc' left to the default in your vimrc). It will prevent any future encoding problem, at least for them.

I've been doing so for more than a decade and have pretty much never had problems, and I sigh with relief every time I see I'm working with one of them.

I have heard many protest against BOMs in UTF-8, but they are the first thing ever to allow reliable encoding detection, and they solve a lot more problems than they can cause (if they cause problems they usually do so immediately and noticeably, which is much better than discovering years later that you irremediably botched the encoding of some file). So I find it absurd to disparage them, and delusional to think that we'll ever get to a point where non-utf8 files are rare enough that we won't need to handle them.

I imagine most of the critics are from countries that never needed more than ASCII.
Re: Changing encoding of an already loaded buffer
On 08/12/2020 17.47, Bram Moolenaar wrote:
> > This works:
> >     :set fencs=utf8
> >     :%!cat
> > although "fenc" remains "latin1".
>
> Yeah, for an existing buffer and filtering the first entry in 'fencs' is
> used to read the filter output, but 'fenc' isn't set. That's a bit
> strange, but I'm not sure what would break if we change this. It might
> actually be good to fix this, since if you write that file it might get
> messed up.

I performed a few tests, trying to write the result to a file after doing the above (using a correct UTF-8 file as source):

- if you leave fenc at latin1, the new file will be in latin1 (with all the characters correctly encoded)
- if you set fenc to utf8 *after* the %!cat (but of course before writing the file), the new file will be in UTF-8 with all the characters correctly encoded
- if you set fenc to utf8 *before* the %!cat (and of course before writing the file), the new file will be... a mess: by all appearances Vim thinks that the individual bytes of the UTF-8 file are individual latin1 characters, and it then converts them to UTF-8, so you'll get a UTF-8 encoded file with the wrong characters. E.g. a "C3 B2" sequence in the original file, which stands for a UTF-8 encoded "ò" (Unicode code point F2), will become a "C3 83 C2 B2" sequence in the written file: "C3" is an "Ã" in latin1 (and yes, in Unicode too), and "Ã" is encoded as "C3 83" in UTF-8; "B2" is a "²" in latin1 (and Unicode), and "²" is encoded as "C2 B2" in UTF-8. (In case someone noticed it, don't let yourself get confused by the fact that C3 and B2 occur both in the source and the translated sequence; that's largely just an unfortunate coincidence of my example.)

Given that Unicode is identical to latin1 in the first 256 characters, to better confirm what happened I also tried using another charset (cp850) instead of latin1 in the above tests (fencs=cp850 in my vimrc, and setting fenc=cp850 in the second and third tests), still using a correct UTF-8 file as a source. The results are analogous: a correct cp850 file in the first test, a correct UTF-8 one in the second, and in the third a UTF-8 one with the original file's bytes interpreted as cp850 and then converted to UTF-8. The original "ò", "C3 B2", becomes an "E2 94 9C E2 96 93" sequence, given that "C3" is a "├" symbol in cp850 (Unicode code point 251C -> "E2 94 9C" in UTF-8) and "B2" is a "▓" (Unicode code point 2593 -> "E2 96 93" in UTF-8).

Yes, I... ahem, had a lot of fun this afternoon :D

Cheers
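The byte sequences from the third test can be reproduced outside Vim. The following Python sketch only imitates the effect (decode the UTF-8 bytes as if they were in another charset, then re-encode as UTF-8); the helper name `misread_then_reencode` is invented here, and nothing in it claims to mirror Vim's internals.

```python
def misread_then_reencode(utf8_bytes, assumed_charset):
    """Decode UTF-8 bytes as if they were in assumed_charset, then re-encode as UTF-8."""
    return utf8_bytes.decode(assumed_charset).encode("utf-8")


# U+00F2 is "ò"; its UTF-8 encoding is the two bytes C3 B2.
original = "\u00f2".encode("utf-8")

for charset in ("latin-1", "cp850"):
    garbled = misread_then_reencode(original, charset)
    print(charset, original.hex(" "), "->", garbled.hex(" "))
```

With "latin-1" the two bytes turn into four, and with "cp850" into six, since each original byte is treated as one character and each of those characters then needs two or three bytes in UTF-8.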
Re: Changing encoding of an already loaded buffer
On 08/12/2020 14.58, A. Wik wrote:
> Thanks a lot for the "%!"-idea! That's what I needed.
>
> This works:
>     :set fencs=utf8
>     :%!cat

That :%!cat is indeed a neat (if hacky) idea!
Re: Changing encoding of an already loaded buffer
On 08/12/2020 10.47, A. Wik wrote:
> Hi all,
>
> I tried a few things:
> (1) gvim -f ++enc=utf8 - result: "E492: Not an editor command: +enc=utf8"
> (2) gvim -f +enc=utf8 - result: see (1)
> (3) gvim -f +"set fenc=utf8" - result: no error message; sets fenc to
> "utf-8", but file is loaded as if with latin1.
> (4) gvim -f -c "set fenc=utf8" - result: see (3)
> (5) gvim -f --cmd "set fenc=utf8" - no error message; fenc remains "latin1"

Yes, I tried stuff like that while perusing the manual a hundred times. It can't work, and that's also more or less stated in some places in the documentation; :h fenc is a jungle, and I seem to remember that it's also not completely correct.

Basically 'fenc' is only looked at when writing a file, and who knows what the output of that write will be. So essentially, besides 'fencs', the ++enc "opt" (which **has nothing to do with the 'enc' option!!!**) is the only thing that can have an effect when reading a file, and after it's read you'd better forget about fixing its encoding.

The only way forward, in my opinion, would be to deprecate 'enc', 'fenc', ++enc and probably 'fencs', giving warnings when they do get used, and to introduce completely different options and commands.
Re: Changing encoding of an already loaded buffer
On Tue, 8 Dec 2020 at 16:47, Bram Moolenaar wrote:
>
>
> Albert Wik wrote:
> >
> > Why does "set fencs=utf8" matter for the "%!cat" operation if Vim is
> > not going to change the "fenc" accordingly?
>
> When reading a file (or filter output) the values in 'fencs' are tried
> one by one. Normally when something fails then the next one is tried,
> but since reading filter output from a pipe doesn't allow for a retry,
> it will always use the first one.

Thanks, that is useful to know.

> The real problem is that 'fencs' was set to "latin1" at first, thus Vim
> didn't even try to use another encoding. Perhaps it also works if you
> do that on the command line:
>     somecommand | vim - -c 'set fencs=utf8,latin1'

No, because (according to --help) the command is run after loading the first file. Meanwhile, "--cmd" does not work because it runs the command before sourcing any vimrc file, and so the new fencs setting gets overwritten by the vimrc. It would be useful to have an option to run a command just *before* loading the first file but after any rc-files.

I don't include utf8 in my default fencs setting because that has the side effect of using utf8 for any newly created files.

-aw
Re: Changing encoding of an already loaded buffer
Albert Wik wrote:

> > > Right. The only way I've found is to use a temporary file.
> > > Incidentally, the zsh shell makes that easy:
> > > % gvim -f =(man llseek)
> >
> > Assuming that loading the text as latin1 didn't mess it up (since it's
> > an 8 bit encoding it should be OK), then you can convert it to utf-8
> > with:
> >     :set fencs=utf-8,latin1
> >     :%!iconv -f latin1 -t utf-8
> >
> > Vim might recognize the utf-8 encoding; if not, set 'fenc':
> >     :set fenc=utf8
> >
> > Hopefully that works.
>
> Thanks a lot for the "%!"-idea! That's what I needed.
>
> This works:
>     :set fencs=utf8
>     :%!cat
> although "fenc" remains "latin1".

Yeah, for an existing buffer and filtering the first entry in 'fencs' is used to read the filter output, but 'fenc' isn't set. That's a bit strange, but I'm not sure what would break if we change this. It might actually be good to fix this, since if you write that file it might get messed up.

> It is not appropriate to use "iconv -f latin1 -t utf8" (that does in
> fact corrupt the data!) because the data is already in UTF-8, and that
> is why it is not displayed properly in Vim (because Vim thinks it is
> in Latin-1); in particular, the short dash character is shown as
> "â<80><90>". When it is displayed properly, a "‐" is shown; putting
> the cursor at it and doing "ga" reports that this is character number
> 0x2010.
>
> Why does "set fencs=utf8" matter for the "%!cat" operation if Vim is
> not going to change the "fenc" accordingly?

When reading a file (or filter output) the values in 'fencs' are tried one by one. Normally when something fails then the next one is tried, but since reading filter output from a pipe doesn't allow for a retry, it will always use the first one.

The real problem is that 'fencs' was set to "latin1" at first, thus Vim didn't even try to use another encoding. Perhaps it also works if you do that on the command line:
    somecommand | vim - -c 'set fencs=utf8,latin1'

Didn't try it. Should at least work if you set 'fencs' in your .vimrc.

--
If an elephant is left tied to a parking meter, the parking fee has to be paid just as it would for a vehicle.
[real standing law in Florida, United States of America]

/// Bram Moolenaar -- b...@moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///
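The "tried one by one" behaviour can be pictured with a short Python sketch. This is only an analogy under simplifying assumptions (Python codec names instead of Vim's, a made-up function name, no BOM handling); it is not how Vim implements 'fencs'.

```python
def first_encoding_that_decodes(data, candidates):
    """Return the first candidate encoding under which data decodes cleanly, or None."""
    for name in candidates:
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue  # this candidate failed, fall through to the next one
    return None


utf8_hyphen = "\u2010".encode("utf-8")  # the bytes E2 80 90
latin1_e_acute = b"\xe9"                # "é" in Latin-1; not valid UTF-8 on its own

# utf-8 listed first: UTF-8 data is recognized, Latin-1 data falls through.
print(first_encoding_that_decodes(utf8_hyphen, ["utf-8", "latin-1"]))
print(first_encoding_that_decodes(latin1_e_acute, ["utf-8", "latin-1"]))
# latin-1 listed first: it accepts any byte sequence, so utf-8 is never tried.
print(first_encoding_that_decodes(utf8_hyphen, ["latin-1", "utf-8"]))
```

The last call illustrates why a list that starts with latin1 never gets as far as trying another encoding.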
Re: Changing encoding of an already loaded buffer
On Tue, 8 Dec 2020 at 12:55, Bram Moolenaar wrote:
>
>
> Albert Wik wrote:
>
> > Right. The only way I've found is to use a temporary file.
> > Incidentally, the zsh shell makes that easy:
> > % gvim -f =(man llseek)
>
> Assuming that loading the text as latin1 didn't mess it up (since it's
> an 8 bit encoding it should be OK), then you can convert it to utf-8
> with:
>     :set fencs=utf-8,latin1
>     :%!iconv -f latin1 -t utf-8
>
> Vim might recognize the utf-8 encoding; if not, set 'fenc':
>     :set fenc=utf8
>
> Hopefully that works.

Thanks a lot for the "%!"-idea! That's what I needed.

This works:
    :set fencs=utf8
    :%!cat
although "fenc" remains "latin1".

It is not appropriate to use "iconv -f latin1 -t utf8" (that does in fact corrupt the data!) because the data is already in UTF-8, and that is why it is not displayed properly in Vim (because Vim thinks it is in Latin-1); in particular, the short dash character is shown as "â<80><90>". When it is displayed properly, a "‐" is shown; putting the cursor at it and doing "ga" reports that this is character number 0x2010.

Why does "set fencs=utf8" matter for the "%!cat" operation if Vim is not going to change the "fenc" accordingly?

Cheers,
Albert.
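The "â<80><90>" display and the difference between converting and merely re-reading can be illustrated in Python. This is a sketch of the byte-level effect only; the variable names are made up, and the "repair" line is an assumption about what re-reading the same bytes as UTF-8 amounts to, not a description of Vim's code path.

```python
hyphen = "\u2010"                  # the character "ga" reports as 0x2010
raw = hyphen.encode("utf-8")       # the bytes actually in the data: E2 80 90

# Reading those bytes as Latin-1 yields "â" followed by two control characters.
misread = raw.decode("latin-1")
print([hex(ord(ch)) for ch in misread])

# Converting the misread text from Latin-1 to UTF-8 doubles the damage.
converted = misread.encode("utf-8")
print(converted.hex(" "))

# Writing the misread text back out as Latin-1 restores the original bytes,
# which can then be read as UTF-8.
repaired = misread.encode("latin-1").decode("utf-8")
print(repaired == hyphen)
```

The conversion step changes the bytes, whereas the round trip through Latin-1 leaves them untouched and only changes how they are interpreted.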
Re: Changing encoding of an already loaded buffer
Albert Wik wrote:

> On Mon, 7 Dec 2020 at 20:49, Gabriele F wrote:
> >
> > The actual "correct" way to "change" the encoding of a buffer is, I
> > believe, with the "++enc" option, added either to :e (e.g. `:e
> > ++enc=utf8`) or several similar commands such as indeed :vi (`:vi
> > ++enc=utf8`).
>
> Thanks, I didn't know about that. It's more convenient than changing
> the "fileencodings".
>
> > However I couldn't find a way to make it work with a file-less buffer,
> > such as your pipe example:
>
> Right. The only way I've found is to use a temporary file.
> Incidentally, the zsh shell makes that easy:
> % gvim -f =(man llseek)

Assuming that loading the text as latin1 didn't mess it up (since it's an 8 bit encoding it should be OK), then you can convert it to utf-8 with:
    :set fencs=utf-8,latin1
    :%!iconv -f latin1 -t utf-8

Vim might recognize the utf-8 encoding; if not, set 'fenc':
    :set fenc=utf8

Hopefully that works.

--
You can be stopped by the police for biking over 65 miles per hour.
You are not allowed to walk across a street on your hands.
[real standing laws in Connecticut, United States of America]

/// Bram Moolenaar -- b...@moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///
Re: Changing encoding of an already loaded buffer
On Mon, 7 Dec 2020 at 20:49, Gabriele F wrote:
>
> The actual "correct" way to "change" the encoding of a buffer is, I
> believe, with the "++enc" option, added either to :e (e.g. `:e
> ++enc=utf8`) or several similar commands such as indeed :vi (`:vi
> ++enc=utf8`).

Thanks, I didn't know about that. It's more convenient than changing the "fileencodings".

> However I couldn't find a way to make it work with a file-less buffer,
> such as your pipe example:

Right. The only way I've found is to use a temporary file. Incidentally, the zsh shell makes that easy:
    % gvim -f =(man llseek)

Regards,
Albert.
Re: Changing encoding of an already loaded buffer
Hi all,

I tried a few things:

(1) gvim -f ++enc=utf8 -
    result: "E492: Not an editor command: +enc=utf8"
(2) gvim -f +enc=utf8 -
    result: see (1)
(3) gvim -f +"set fenc=utf8" -
    result: no error message; sets fenc to "utf-8", but file is loaded
    as if with latin1.
(4) gvim -f -c "set fenc=utf8" -
    result: see (3)
(5) gvim -f --cmd "set fenc=utf8" -
    result: no error message; fenc remains "latin1"

A different approach:

(6) (man llseek ; echo 'vim:fenc=utf8:') | gvim -f -
    result: no error message; fenc gets set to "utf-8"; file is loaded
    as if with latin1

See also below:

On Tue, 8 Dec 2020 at 01:45, Tony Mechelynck wrote:
>
> If you find out after loading the stdin that it was opened in the
> wrong encoding, then it's too late; but if you know the file's
> encoding in advance, there should be a way, especially if your
> 'encoding' (the charset used internally by Vim) is UTF-8 and if your
> Vim is compiled with +iconv.

Both conditions hold true.

> To be able to detect Latin1 and UTF-8 (and UTF-16 with BOM) automagically, add
> set fileencodings=ucs-bom,utf-8,latin1

I tried that months ago. The result was that new files were assumed
to have fenc=utf-8, for reasons you mention below. This is not
acceptable, so I use "fileencodings=ucs-bom,latin1,cp437" (yes, I know
the trailing ",cp437" is pointless).

> somewhere in your vimrc (the s at the end of fileencodings is
> important); but this isn't enough for files in cp437, especially if
> Vim gets them on stdin. For those, load them with (untested)
> someprogram | view ++enc=cp437 -

I tested it; see top of message.

> The above will detect files in 7-bit us-ascii encoding as utf-8 rather
> than Latin1. This is not a bug, because the 128 characters which are
> valid in us-ascii are represented identically in all three:
> us-ascii, Latin1 and UTF-8.

Right!

Cheers,
Albert.
Re: Changing encoding of an already loaded buffer
On Mon, Dec 7, 2020 at 5:40 PM A. Wik wrote:
>
> Hi all,
>
> I sometimes need to change the encoding used for a file. I have the
> default set to latin1 except for files with an ucs-bom. However, when
> I load a file encoded in UTF-8 or CP-437 the default is wrong. What I
> do then is normally to ":set fencs=utf8" and ":vi" to reload the file.
>
> However, what can I do about a file that cannot be reloaded? Eg:
>
> $ man llseek | gvim -f -
>
> To work around it, I have to do this:
>
> $ man llseek > llseek.man
> $ gvim llseek.man
>
> Is there another way?
>
> Regards,
> Albert.

If you find out after loading the stdin that it was opened in the
wrong encoding, then it's too late; but if you know the file's
encoding in advance, there should be a way, especially if your
'encoding' (the charset used internally by Vim) is UTF-8 and if your
Vim is compiled with +iconv.

To be able to detect Latin1 and UTF-8 (and UTF-16 with BOM) automagically, add

    set fileencodings=ucs-bom,utf-8,latin1

somewhere in your vimrc (the s at the end of fileencodings is
important); but this isn't enough for files in cp437, especially if
Vim gets them on stdin. For those, load them with (untested)

    someprogram | view ++enc=cp437 -

(the minus sign at the end is important) which means that you have to
know the file's encoding before starting Vim if it is other than UTF-8
or Latin1. Using "view" instead of "vim" on the command-line avoids
problems with the 'modified' flag; for ++enc see ":help ++enc".

The above will detect files in 7-bit us-ascii encoding as utf-8 rather
than Latin1. This is not a bug, because the 128 characters which are
valid in us-ascii are represented identically in all three:
us-ascii, Latin1 and UTF-8.

Best regards,
Tony.
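The ordered fallback described for 'fileencodings=ucs-bom,utf-8,latin1' can be sketched in Python. This is only an illustration of the idea, not Vim's actual detection code: `guess_encoding` is a hypothetical helper, the returned labels are made up, and UTF-32 byte order marks are deliberately ignored.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Hypothetical helper: try candidate encodings in a fixed order."""
    # Step 1 ("ucs-bom"): a byte order mark settles the question.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 (with BOM)"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16 (with BOM)"
    # Step 2 ("utf-8"): accept only if every byte sequence is valid UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Step 3 ("latin1"): every byte is valid Latin-1, so this never fails
    # and anything listed after it would never be reached.
    return "latin1"


print(guess_encoding(b"plain ascii"))
print(guess_encoding("café".encode("utf-8")))
print(guess_encoding("café".encode("latin-1")))
print(guess_encoding("café".encode("cp437")))
```

In this sketch pure ASCII input is reported as utf-8, matching the remark that 7-bit files are detected as utf-8 rather than Latin1. The cp437 bytes are not valid UTF-8 and fall through to latin1, which is why such files need their encoding stated explicitly.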
Re: Changing encoding of an already loaded buffer
Ah yes, I had also tried passing "-" as a filename for the reload attempts, but no: it was interpreted as an actual file named "-"...
Re: Changing encoding of an already loaded buffer
The actual "correct" way to "change" the encoding of a buffer is, I
believe, with the "++enc" option, added either to :e (e.g. `:e
++enc=utf8`) or several similar commands such as indeed :vi (`:vi
++enc=utf8`).

However I couldn't find a way to make it work with a file-less buffer,
such as your pipe example:

If I use `:e! ++enc=utf8` I'm given an «E32: No file name» error.

I thought of passing "%" or "#n" as the filename for :e (`:e
++enc=utf8 %`), but it doesn't work: I'm given an «E499: Empty file name
for '%' or '#', only works with ":p:h"» error (and indeed the `:h _%`
stuff is described as standing for "file names", not for the actual
buffers).

Then I tried adding a filename, with `:file whatever`, but once that's
done :e! loads a new empty buffer named "whatever"...

So there doesn't seem to be a way to really reload (possibly with
different encoding options) the current buffer, only to reload the file
from which the current buffer was loaded, and so for file-less buffers
there is no way at all.

However under Linux and other systems there may well be a way to access
the buffer's file descriptor (/dev/fd/0 ?), so it might work by
passing that as the filename. And there's probably some other way by
copying the text around.

By the way, apparently this also means that you can't even set the
encoding of a pipe that you haven't yet created, from the shell, since
to the best of my knowledge the only way to set the encoding of a file
from the shell, before opening it, is `vim +":e ++enc= "` (which actually
means to open it from inside vim). But maybe you can with some more
intricate command.

I'm far from being a Vim expert however, so I might well be missing
something (or a lot). And encoding stuff is in general quite a mess in
Vim, I'll grumble about it one time or another... :/

Cheers