Re: How to save a UTF-8 file on Windows using a non-ASCII name
Hi Fan,

2008/1/21 Fan Decheng [EMAIL PROTECTED]:
> [...]
>> Here is a snippet from the Vim's reference:
>>
>>     NOTE: Changing this option will not change the encoding of the
>>     existing text in Vim. It may cause non-ASCII text to become
>>     invalid. It should normally be kept at its default value, or
>>     set when Vim starts up. See |multibyte|. To reload the menus
>>     see |:menutrans|.
>
> Thanks. It seems that setting `encoding' before opening the file
> works.

Yes, it works fine for this case. But that doesn't necessarily mean
it's also OK for all the functions in Vim. So as a practical
suggestion: never modify 'encoding' once you have launched Vim. Set
this option only *ONCE*, in your .vimrc.

L. F.

--~--~-~--~~~---~--~~
You received this message from the vim_dev maillist.
For more information, visit http://www.vim.org/maillist.php
-~--~~~~--~~--~--~---
Re: How to save a UTF-8 file on Windows using a non-ASCII name
Ben Schmidt wrote:
>> Here is a snippet from the Vim's reference:
>>
>>     NOTE: Changing this option will not change the encoding of the
>>     existing text in Vim. It may cause non-ASCII text to become
>>     invalid. It should normally be kept at its default value, or
>>     set when Vim starts up. See |multibyte|. To reload the menus
>>     see |:menutrans|.
>>
>> Personally I think this should be a bug of Vim. However, as it had
>> already been well-documented, I think you should follow the
>> principles.
>
> I personally think that's perfectly reasonable and not a bug.
>
> But something I really do think is worth changing because it's
> really confusing, is ++enc. Why do we call this ++enc not ++fenc
> which would make a huge amount more sense, and be more consistent
> with ++ff and ++bin which both set their namesake options? We see
> evidence of people getting 'enc' and 'fenc' confused on a regular
> basis, and this feature naming really doesn't help matters. What
> would you think of changing this, Bram? Perhaps making it officially
> ++fenc but accepting ++enc for compatibility with old scripts (and
> old users!)?

These are not really option names, although I can see that they are so
similar that people might think that. They are options for the
command. Offering more alternatives isn't going to make it simpler. We
also don't have 'fbin' for reading a file in binary mode.

> Also, I wonder whether it might be worth adding a 'best practices'
> section to mbyte.txt and referring to it in such places as 'enc'
> (probably mostly there) which explains the basics in a few short
> paragraphs: set 'enc' in your .vimrc (recommend utf-8), 'tenc' if
> your terminal/locale is different (unneeded in GUI), use ++fenc if a
> file is read with wrong encoding detected, 'fenc' to change what
> encoding to write a file with for future writes. This sort of
> material is repeated frequently on the mailing lists which suggests
> users aren't finding it easily in the help (though it is all there,
> it is somewhat spread around, etc.).
>
> Do others think this might or something similar might be a good
> idea?

It will certainly help to update the documentation to explain common
pitfalls. Writing this in a nice way, without becoming too verbose,
and putting it in the right place is not easy. If you or someone else
can suggest a patch that would be great.

--
Wi n0t trei a h0liday in Sweden thi yer?
    Monty Python and the Holy Grail PYTHON (MONTY) PICTURES LTD

 /// Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ download, build and distribute -- http://www.A-A-P.org ///
 \\\ help me help AIDS victims -- http://ICCF-Holland.org ///
Re: How to save a UTF-8 file on Windows using a non-ASCII name
>> But something I really do think is worth changing because it's
>> really confusing, is ++enc. Why do we call this ++enc not ++fenc
>> which would make a huge amount more sense, and be more consistent
>> with ++ff and ++bin which both set their namesake options? We see
>> evidence of people getting 'enc' and 'fenc' confused on a regular
>> basis, and this feature naming really doesn't help matters. What
>> would you think of changing this, Bram? Perhaps making it
>> officially ++fenc but accepting ++enc for compatibility with old
>> scripts (and old users!)?
>
> These are not really option names, although I can see that they are
> so similar that people might think that. They are options for the
> command. Offering more alternatives isn't going to make it simpler.
> We also don't have 'fbin' for reading a file in binary mode.

Yes, I realise they're not option names, but they undeniably do look
like it, and time and time again we see people getting 'fenc' and
'enc' confused, and ++ff sets 'ff' when reading, ++bin sets 'bin', but
++enc sets 'fenc'. This is only confusing because there *is* an option
called 'enc'. If there weren't, as there isn't a 'fbin' option, nor a
'format' option, it would be no problem.

But, hey, it's not too important. I just thought I'd throw it out
there, as I think it has potential to help a lot of users avoid
confusion. Maybe I'm wrong. It would be interesting to hear what
others think.

>> Also, I wonder whether it might be worth adding a 'best practices'
>> section to mbyte.txt and referring to it in such places as 'enc'
>> (probably mostly there) which explains the basics in a few short
>> paragraphs: set 'enc' in your .vimrc (recommend utf-8), 'tenc' if
>> your terminal/locale is different (unneeded in GUI), use ++fenc if
>> a file is read with wrong encoding detected, 'fenc' to change what
>> encoding to write a file with for future writes. This sort of
>> material is repeated frequently on the mailing lists which suggests
>> users aren't finding it easily in the help (though it is all there,
>> it is somewhat spread around, etc.).
>>
>> Do others think this might or something similar might be a good
>> idea?
>
> It will certainly help to update the documentation to explain common
> pitfalls. Writing this in a nice way, without becoming too verbose
> and putting it in the right place is not easy. If you or someone
> else can suggest a patch that would be great.

I agree. Terseness is definitely not my strength, though, so I might
leave this to someone else to have a try if anyone is willing.
Volunteers?

Ben.
Re: How to save a UTF-8 file on Windows using a non-ASCII name
This is the vim-dev mailing list; it would be better to send your
question to [EMAIL PROTECTED].

PS. If you want to save a file with UTF-8 content, just use
:set fenc=utf-8, not enc=utf-8.

Good luck!

On 1/20/08, Fan Decheng [EMAIL PROTECTED] wrote:
> Here I mean on the Windows platform, using Vim 6.4 or 7.1.
>
> I've encountered this problem several times, but don't know whether
> there is a solution:
>
> 1. Use gvim to open a file with Chinese characters in its name. For
>    example: 测试.txt .
> 2. Type :set enc=utf-8 (without quotes).
> 3. Type :e to make the file content displayed using utf-8.
> 4. Type :wq to save the file.
>
> After these steps, the file is saved in the name ²âÊÔ.txt rather
> than the original name. Another thing that went wrong is
> 测试.txt.swp is left undeleted.
>
> I looked for any file name encoding options in vim but failed to
> find anything.
>
> Any ideas?
>
> --
> Fan Decheng (Robbie Mosaic)
> [EMAIL PROTECTED]

--
leal @ www.leal.cn
Re: How to save a UTF-8 file on Windows using a non-ASCII name
Hi Fan,

On Jan 20, 2008 3:03 PM, Fan Decheng [EMAIL PROTECTED] wrote:
> Here I mean on the Windows platform, using Vim 6.4 or 7.1.
>
> I've encountered this problem several times, but don't know whether
> there is a solution:
>
> 1. Use gvim to open a file with Chinese characters in its name. For
>    example: 测试.txt .
> 2. Type :set enc=utf-8 (without quotes).

Here is a snippet from Vim's reference:

    NOTE: Changing this option will not change the encoding of the
    existing text in Vim. It may cause non-ASCII text to become
    invalid. It should normally be kept at its default value, or
    set when Vim starts up. See |multibyte|. To reload the menus
    see |:menutrans|.

Personally I think this ought to count as a bug in Vim. However, since
it is already well documented, I think you should follow the
documented rule.

> 3. Type :e to make the file content displayed using utf-8.
> 4. Type :wq to save the file.
>
> After these steps, the file is saved in the name ²âÊÔ.txt rather
> than the original name. Another thing that went wrong is
> 测试.txt.swp is left undeleted.
>
> I looked for any file name encoding options in vim but failed to
> find anything.

Please read the documentation of the following options carefully:

    fencs
    fenc
    encoding
    termencoding

> Any ideas?
>
> --
> Fan Decheng (Robbie Mosaic)
> [EMAIL PROTECTED]

Regards,

L. F.
Re: How to save a UTF-8 file on Windows using a non-ASCII name
Linxiao wrote:
> This is vim-dev maillist, better sending your question to
> [EMAIL PROTECTED] .
>
> PS. If you wanna save a UTF-8 content file, just :set fenc=utf-8
> but not enc=utf-8.
>
> Good Luck!

Tt, tt, tt...

If 'encoding' is other than UTF-8 (or GB18030), Vim cannot represent
all Unicode codepoints in memory; therefore, if you try to edit a
UTF-8 file you run the risk of losing part of the data. (If you set
'enc' to UTF-16, UCS-2 or UCS-4 aka UTF-32, with any endianness, what
Vim will use is actually UTF-8.)

To edit UTF-8 data you should have both 'encoding' (= memory
representation of the data) and 'fileencoding' (= disk representation
of the data) set to UTF-8.

> On 1/20/08, Fan Decheng [EMAIL PROTECTED] wrote:
>> Here I mean on the Windows platform, using Vim 6.4 or 7.1.
>>
>> I've encountered this problem several times, but don't know whether
>> there is a solution:
>>
>> 1. Use gvim to open a file with Chinese characters in its name. For
>>    example: 测试.txt .
>> 2. Type :set enc=utf-8 (without quotes).
>> 3. Type :e to make the file content displayed using utf-8.
>> 4. Type :wq to save the file.
>>
>> After these steps, the file is saved in the name ²âÊÔ.txt rather
>> than the original name. Another thing that went wrong is
>> 测试.txt.swp is left undeleted.
>>
>> I looked for any file name encoding options in vim but failed to
>> find anything.
>>
>> Any ideas?
>>
>> --
>> Fan Decheng (Robbie Mosaic)
>> [EMAIL PROTECTED]

Best regards,
Tony.
--
During a grouse hunt in North Carolina two intrepid sportsmen were
blasting away at a clump of trees near a stone wall. Suddenly a
red-faced country squire popped his head over the wall and shouted,
"Hey, you almost hit my wife."
"Did I?" cried the hunter, aghast. "Terribly sorry. Have a shot at
mine, over there."
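The point about 'encoding' limiting what can be held in memory can be
illustrated outside Vim. The following is a minimal Python sketch, not
a description of Vim's internals: Python codecs stand in for the value
of 'encoding', and the helper name survives_round_trip is made up for
this illustration. It checks whether a piece of text can be stored in
a given encoding and read back unchanged.

```python
def survives_round_trip(text: str, codec: str) -> bool:
    """Return True if `text` can be stored in `codec` and read back unchanged.

    Hypothetical helper: `codec` plays the role of Vim's 'encoding' here.
    """
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeEncodeError:
        # The encoding has no representation for at least one character.
        return False


samples = ["测试", "测试 😀"]
for codec in ("latin-1", "cp936", "utf-8"):
    for text in samples:
        # ascii() keeps the output printable on any console encoding.
        print(codec, ascii(text), survives_round_trip(text, codec))
```

Under these assumptions only UTF-8 keeps both samples intact: cp936
holds the Chinese text but not the emoji, and latin-1 holds neither.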
Re: How to save a UTF-8 file on Windows using a non-ASCII name
Hi Tony,

On Jan 21, 2008 11:41 AM, Tony Mechelynck [EMAIL PROTECTED] wrote:
> Linxiao wrote:
>> [...]
>
> Tt, tt, tt...
>
> If 'encoding' is other than UTF-8 (or GB18030), Vim cannot represent
> all Unicode codepoints in memory; therefore, if you try to edit a
> UTF-8 file you run the risk of losing part of the data. (If you set
> 'enc' to UTF-16, UCS-2 or UCS-4 aka UTF-32, with any endianness,
> what Vim will use is actually UTF-8.)

I'm familiar with the different shapes that malformed characters take.
In fact, the original poster's problem was not caused by losing code
points. ²âÊÔ was generated by the following steps:

1. At first, Vim holds the file name 测试 in GBK encoding.
2. Then the original poster resets 'encoding' to UTF-8, so the
   encoding information for the file name inside Vim is lost. Vim
   re-interprets the file name bytes as Latin-1.
3. Vim converts the Latin-1 string to UTF-8.
4. Vim saves the file to disk under the new name. Windows will
   convert the UTF-8 string to UCS, of course. Now the new file name
   is exactly ²âÊÔ.

Here is an illustration (my system charset is UTF-8):

    [EMAIL PROTECTED] ~]$ echo 测试 | iconv -f utf-8 -t gbk | iconv -f latin1 -t utf-8
    ²âÊÔ

> To edit UTF-8 data you should have both 'encoding' (= memory
> representation of the data) and 'fileencoding (= disk representation
> of the data) set to UTF-8.
>
> [...]
>
> Best regards,
> Tony.

Regards,

L. F.
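The same chain can be reproduced without iconv. This is a small Python
sketch of the byte-level effect only, assuming (as in the message
above) that the name is stored as GBK and then misread as Latin-1; the
function names are made up for illustration and say nothing about how
Vim implements the conversion.

```python
def misread_name(name: str, real_codec: str = "gbk",
                 assumed_codec: str = "latin-1") -> str:
    """Encode `name` in its real encoding, then decode the bytes as if
    they were in a different one. This produces the garbled name."""
    raw = name.encode(real_codec)
    return raw.decode(assumed_codec)


def repair_name(garbled: str, real_codec: str = "gbk",
                assumed_codec: str = "latin-1") -> str:
    """Undo misread_name: recover the raw bytes and decode them with the
    encoding they were really written in."""
    return garbled.encode(assumed_codec).decode(real_codec)


garbled = misread_name("测试.txt")
# ascii() keeps the output printable on any console encoding.
print(ascii(garbled))
print(ascii(repair_name(garbled)))
```

Because Latin-1 maps every byte to exactly one character, the garbling
is lossless, and renaming the file back is possible by reversing the
two steps.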
Re: How to save a UTF-8 file on Windows using a non-ASCII name
On Jan 21, 11:41 am, Tony Mechelynck [EMAIL PROTECTED] wrote:
> Linxiao wrote:
>> This is vim-dev maillist, better sending your question to
>> [EMAIL PROTECTED] .
>>
>> PS. If you wanna save a UTF-8 content file, just :set fenc=utf-8
>> but not enc=utf-8.
>>
>> Good Luck!
>
> Tt, tt, tt...
>
> If 'encoding' is other than UTF-8 (or GB18030), Vim cannot represent
> all Unicode codepoints in memory; therefore, if you try to edit a
> UTF-8 file you run the risk of losing part of the data. (If you set
> 'enc' to UTF-16, UCS-2 or UCS-4 aka UTF-32, with any endianness,
> what Vim will use is actually UTF-8.)
>
> To edit UTF-8 data you should have both 'encoding' (= memory
> representation of the data) and 'fileencoding (= disk representation
> of the data) set to UTF-8.
>
> Best regards,
> Tony.

Thanks! I tried these:

    D:\test>gvim
    :set fenc=utf-8
    :set enc=utf-8
    :e 测试.txt
    E37: No write since last change (add ! to override)
    :e! 测试.txt
    :set fenc?
      fileencoding=utf-8
    :set enc?
      encoding=utf-8

It is all right. Note that in the above test I opened gvim with an
empty buffer first.

Another test shows some problems that I should take care of:

    D:\test>gvim 测试.txt
    :set fenc?
      fileencoding=
    :set enc?
      encoding=cp936

Of course, in this situation the file contents are displayed in the
wrong encoding. Then I tried this:

    :set enc=utf-8

After this, the window title changed to:

    b2e2cad4.txt (D:\test) - GVIM

    :set fenc?
      fileencoding=
    :set enc?
      encoding=utf-8

I proceeded with:

    :e

The window title became:

    b2e2cad4.txt = (D:\test) - GVIM

    :set fenc?
      fileencoding=utf-8
    :set enc?
      encoding=utf-8

However, as I observed, the name of the .swp file is still
.测试.txt.swp.

To make the window title correct and make the file writable
(non-read-only), I typed:

    :e 测试.txt

A new buffer was opened and the window title changed back to the
original:

    测试.txt = (D:\test) - GVIM

Now every write to the file is OK. However, after exiting gvim, the
swap file is still there.

Sorry for writing at such length; this is just for reference. I've
read the help for `fencs', but I didn't find it helpful in this
situation.
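The hex digits in the window title above match the cp936 (GBK) bytes
of the original file name, which suggests the title is showing the raw
bytes of a name that is no longer valid under the new 'encoding'. A
short Python check of that byte arithmetic, with raw_hex as a made-up
helper name:

```python
def raw_hex(name: str, codec: str) -> str:
    """Return the bytes of `name` in `codec`, written as lowercase hex."""
    return name.encode(codec).hex()


# Bytes of the name under the system code page from the session (cp936).
print(raw_hex("测试", "cp936"))
# Bytes of the same name in UTF-8, for comparison.
print(raw_hex("测试", "utf-8"))
```

The two byte sequences differ completely, so bytes written under cp936
cannot be read back as a valid UTF-8 name.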
Re: How to save a UTF-8 file on Windows using a non-ASCII name
On Jan 21, 11:43 am, Edward L. Fox [EMAIL PROTECTED] wrote:
> Hi Fan,
>
> On Jan 20, 2008 3:03 PM, Fan Decheng [EMAIL PROTECTED] wrote:
>> Here I mean on the Windows platform, using Vim 6.4 or 7.1.
>>
>> I've encountered this problem several times, but don't know whether
>> there is a solution:
>>
>> 1. Use gvim to open a file with Chinese characters in its name. For
>>    example: 测试.txt .
>> 2. Type :set enc=utf-8 (without quotes).
>
> Here is a snippet from the Vim's reference:
>
>     NOTE: Changing this option will not change the encoding of the
>     existing text in Vim. It may cause non-ASCII text to become
>     invalid. It should normally be kept at its default value, or
>     set when Vim starts up. See |multibyte|. To reload the menus
>     see |:menutrans|.

Thanks. It seems that setting `encoding' before opening the file
works.