Re: How to save a UTF-8 file on Windows using a non-ASCII name

2008-01-21 Thread Edward L. Fox

Hi Fan,

2008/1/21 Fan Decheng [EMAIL PROTECTED]:

 [...]
 
  Here is a snippet from the Vim's reference:
 
  NOTE: Changing this option will not change the encoding of the
  existing text in Vim.  It may cause non-ASCII text to become invalid.
  It should normally be kept at its default value, or set when Vim
  starts up.  See |multibyte|.  To reload the menus see |:menutrans|.

 Thanks. It seems that setting `encoding' before opening the file works.

Yes, it works fine in this case.  But that doesn't necessarily mean
it's safe for every function in Vim.  So as a practical suggestion,
never modify 'encoding' once Vim is already running.  Set this option
only *ONCE*, in your .vimrc.


 



L. F.

--~--~-~--~~~---~--~~
You received this message from the vim_dev maillist.
For more information, visit http://www.vim.org/maillist.php
-~--~~~~--~~--~--~---



Re: How to save a UTF-8 file on Windows using a non-ASCII name

2008-01-21 Thread Bram Moolenaar


Ben Schmidt wrote:

  Here is a snippet from the Vim's reference:
  
  NOTE: Changing this option will not change the encoding of the
  existing text in Vim.  It may cause non-ASCII text to become invalid.
  It should normally be kept at its default value, or set when Vim
  starts up.  See |multibyte|.  To reload the menus see |:menutrans|.
  
  Personally I think this should be a bug of Vim.  However, as it had
  already been well-documented, I think you should follow the
  principles.
 
 I personally think that's perfectly reasonable and not a bug.
 
 But something I really do think is worth changing because it's really
 confusing, is ++enc. Why do we call this ++enc not ++fenc which would
 make a huge amount more sense, and be more consistent with ++ff and
 ++bin which both set their namesake options? We see evidence of people
 getting 'enc' and 'fenc' confused on a regular basis, and this feature
 naming really doesn't help matters. What would you think of changing
 this, Bram? Perhaps making it officially ++fenc but accepting ++enc
 for compatibility with old scripts (and old users!)?

These are not really option names, although I can see that they are so
similar that people might think that.  They are options for the command.
Offering more alternatives isn't going to make it simpler.  We also
don't have 'fbin' for reading a file in binary mode.

 Also, I wonder whether it might be worth adding a 'best practices'
 section to mbyte.txt and referring to it in such places as 'enc'
 (probably mostly there) which explains the basics in a few short
 paragraphs: set 'enc' in your .vimrc (recommend utf-8), 'tenc' if your
 terminal/locale is different (unneeded in GUI), use ++fenc if a file
 is read with wrong encoding detected, 'fenc' to change what encoding
 to write a file with for future writes. This sort of material is
 repeated frequently on the mailing lists which suggests users aren't
 finding it easily in the help (though it is all there, it is somewhat
 spread around, etc.). Do others think this or something similar
 might be a good idea?

It will certainly help to update the documentation to explain common
pitfalls.  Writing this in a nice way, without becoming too verbose and
putting it in the right place is not easy.  If you or someone else can
suggest a patch that would be great.

-- 
Wi n0t trei a h0liday in Sweden thi yer?
 Monty Python and the Holy Grail PYTHON (MONTY) PICTURES LTD

 /// Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.Moolenaar.net   \\\
///sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\download, build and distribute -- http://www.A-A-P.org///
 \\\help me help AIDS victims -- http://ICCF-Holland.org///




Re: How to save a UTF-8 file on Windows using a non-ASCII name

2008-01-21 Thread Ben Schmidt

 But something I really do think is worth changing because it's really
 confusing, is ++enc. Why do we call this ++enc not ++fenc which would
 make a huge amount more sense, and be more consistent with ++ff and
 ++bin which both set their namesake options? We see evidence of people
 getting 'enc' and 'fenc' confused on a regular basis, and this feature
 naming really doesn't help matters. What would you think of changing
 this, Bram? Perhaps making it officially ++fenc but accepting ++enc
 for compatibility with old scripts (and old users!)?
 
 These are not really option names, although I can see that they are so
 similar that people might think that.  They are options for the command.
 Offering more alternatives isn't going to make it simpler.  We also
 don't have 'fbin' for reading a file in binary mode.

Yes, I realise they're not option names, but they undeniably do look like
it, and time and time again we see people getting 'fenc' and 'enc'
confused.  ++ff sets 'ff' when reading and ++bin sets 'bin', but ++enc sets
'fenc'.  This is only confusing because there *is* an option called 'enc'.
If there weren't, just as there is no 'fbin' option and no 'format' option,
it would be no problem.

But, hey, it's not too important.  I just thought I'd throw it out there,
as I think it has the potential to help a lot of users avoid confusion.
Maybe I'm wrong.  It would be interesting to hear what others think.

 Also, I wonder whether it might be worth adding a 'best practices'
 section to mbyte.txt and referring to it in such places as 'enc'
 (probably mostly there) which explains the basics in a few short
 paragraphs: set 'enc' in your .vimrc (recommend utf-8), 'tenc' if your
 terminal/locale is different (unneeded in GUI), use ++fenc if a file
 is read with wrong encoding detected, 'fenc' to change what encoding
 to write a file with for future writes. This sort of material is
 repeated frequently on the mailing lists which suggests users aren't
 finding it easily in the help (though it is all there, it is somewhat
 spread around, etc.). Do others think this or something similar
 might be a good idea?
 
 It will certainly help to update the documentation to explain common
 pitfalls.  Writing this in a nice way, without becoming too verbose and
 putting it in the right place is not easy.  If you or someone else can
 suggest a patch that would be great.

I agree. Terseness is definitely not my strength, though, so I might leave
this to someone else to have a try if anyone is willing. Volunteers?

Ben.








Re: How to save a UTF-8 file on Windows using a non-ASCII name

2008-01-20 Thread Linxiao
This is the vim-dev mailing list; it would be better to send your question
to [EMAIL PROTECTED] .

P.S. If you want to save a file with UTF-8 content, just use
:set fenc=utf-8, not :set enc=utf-8.

Good Luck!

On 1/20/08, Fan Decheng [EMAIL PROTECTED] wrote:

 Here I mean on the Windows platform, using Vim 6.4 or 7.1.

 I've encountered this problem several times, but don't know whether
 there is a solution:

 1. Use gvim to open a file with Chinese characters in its name. For
 example: 测试.txt .
 2. Type :set enc=utf-8 (without quotes).
 3. Type :e to make the file content displayed using utf-8.
 4. Type :wq to save the file.

 After these steps, the file is saved in the name ²âÊÔ.txt rather than the
 original name. Another thing that went wrong is 测试.txt.swp is left
 undeleted.

 I looked for any file name encoding options in vim but failed to find
 anything.
 Any ideas?

 --
   Fan Decheng
   (Robbie Mosaic)
  [EMAIL PROTECTED]



 



-- 
leal @ www.leal.cn




Re: How to save a UTF-8 file on Windows using a non-ASCII name

2008-01-20 Thread Edward L. Fox
Hi Fan,

On Jan 20, 2008 3:03 PM, Fan Decheng [EMAIL PROTECTED] wrote:

 Here I mean on the Windows platform, using Vim 6.4 or 7.1.

 I've encountered this problem several times, but don't know whether
 there is a solution:

 1. Use gvim to open a file with Chinese characters in its name. For
 example: 测试.txt .
 2. Type :set enc=utf-8 (without quotes).

Here is a snippet from Vim's reference:

NOTE: Changing this option will not change the encoding of the
existing text in Vim.  It may cause non-ASCII text to become invalid.
It should normally be kept at its default value, or set when Vim
starts up.  See |multibyte|.  To reload the menus see |:menutrans|.

Personally I think this should be considered a bug in Vim.  However,
since the behaviour is already well documented, I think you should
follow the documented advice.

 3. Type :e to make the file content displayed using utf-8.
 4. Type :wq to save the file.

 After these steps, the file is saved in the name ²âÊÔ.txt rather than the
 original name. Another thing that went wrong is 测试.txt.swp is left
 undeleted.

 I looked for any file name encoding options in vim but failed to find
 anything.

Please read the documentation for the following options carefully:

fencs
fenc
encoding
termencoding

 Any ideas?

 --
Fan Decheng
(Robbie Mosaic)
   [EMAIL PROTECTED]



 



Regards,


L. F.




Re: How to save a UTF-8 file on Windows using a non-ASCII name

2008-01-20 Thread Tony Mechelynck

Linxiao wrote:
 This is vim-dev maillist, better sending your question to [EMAIL PROTECTED] .
 
 PS. If you wanna save a UTF-8 content file, just :set fenc=utf-8  but
 not enc=utf-8.
 
 Good Luck!

Tt, tt, tt... If 'encoding' is other than UTF-8 (or GB18030), Vim cannot 
represent all Unicode codepoints in memory; therefore, if you try to edit a 
UTF-8 file you run the risk of losing part of the data. (If you set 'enc' to 
UTF-16, UCS-2 or UCS-4 aka UTF-32, with any endianness, what Vim will use is 
actually UTF-8.)

To edit UTF-8 data you should have both 'encoding' (= memory representation of
the data) and 'fileencoding' (= disk representation of the data) set to UTF-8.
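
As a rough illustration of the point about in-memory representation, here is
a small sketch in Python rather than in Vim itself.  The sample strings and
the helper name can_represent are assumptions chosen for the example, not
anything from Vim.  It only checks which encodings can round-trip a given
piece of text, which is the property 'encoding' needs in order to hold
arbitrary Unicode:

```python
def can_represent(text: str, encoding: str) -> bool:
    """Return True if `text` survives a round trip through `encoding`."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        return False


# A word covered by GBK, and a character outside the Basic Multilingual Plane.
samples = ["测试", "😀"]

for enc in ("gbk", "latin-1", "utf-8", "gb18030"):
    print(enc, [can_represent(s, enc) for s in samples])
```

Under these assumptions, only utf-8 and gb18030 should report True for both
samples, which matches the "UTF-8 (or GB18030)" remark above.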

 



 
 

Best regards,
Tony.
-- 
During a grouse hunt in North Carolina two intrepid sportsmen
were blasting away at a clump of trees near a stone wall.  Suddenly a
red-faced country squire popped his head over the wall and shouted,
Hey, you almost hit my wife.
Did I?  cried the hunter, aghast.  Terribly sorry.  Have a
shot at mine, over there.




Re: How to save a UTF-8 file on Windows using a non-ASCII name

2008-01-20 Thread Edward L. Fox
Hi Tony,

On Jan 21, 2008 11:41 AM, Tony Mechelynck [EMAIL PROTECTED] wrote:

 Linxiao wrote:
 [...]

 Tt, tt, tt... If 'encoding' is other than UTF-8 (or GB18030), Vim cannot
 represent all Unicode codepoints in memory; therefore, if you try to edit a
 UTF-8 file you run the risk of losing part of the data. (If you set 'enc' to
 UTF-16, UCS-2 or UCS-4 aka UTF-32, with any endianness, what Vim will use is
 actually UTF-8.)

I'm familiar with the various forms that garbled characters can take.
In fact the original poster's problem was not caused by lost code
points.  ²âÊÔ was produced by the following steps:

1. At first, Vim holds the filename 测试 in GBK encoding.

2. Then he resets 'encoding' to UTF-8.  Vim does not convert the
stored filename, so the information about its encoding is lost, and
Vim re-interprets the filename as Latin-1.

3. Vim converts the Latin-1 string to UTF-8.

4. Vim saves the file to disk under the new name.  Windows converts
the UTF-8 string to its native Unicode form (UTF-16), of course.  Now
the new filename is exactly ²âÊÔ.

Here is an illustration (my system charset is UTF-8):

$ echo 测试 | iconv -f utf-8 -t gbk | iconv -f latin1 -t utf-8
²âÊÔ
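
The same chain can be sketched in Python, for anyone without iconv at hand.
This is only a model of the steps described above, not a trace of what Vim
does internally; the variable names are made up for the example.  It also
suggests that the damage is reversible as long as no byte was dropped on
the way:

```python
original = "测试"

# Step 1: the name as held in memory while 'encoding' was cp936 (GBK).
gbk_bytes = original.encode("gbk")

# Steps 2 and 3: the same bytes misread as Latin-1, then converted to UTF-8.
mojibake = gbk_bytes.decode("latin-1")
utf8_bytes = mojibake.encode("utf-8")

print(mojibake)

# Undo the chain: UTF-8 -> text -> Latin-1 bytes -> reinterpret as GBK.
recovered = utf8_bytes.decode("utf-8").encode("latin-1").decode("gbk")
print(recovered == original)
```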

 To edit UTF-8 data you should have both 'encoding' (= memory representation of
 the data) and 'fileencoding' (= disk representation of the data) set to UTF-8.

 [...]

 Best regards,
 Tony.
 --
 During a grouse hunt in North Carolina two intrepid sportsmen
 were blasting away at a clump of trees near a stone wall.  Suddenly a
 red-faced country squire popped his head over the wall and shouted,
 Hey, you almost hit my wife.
 Did I?  cried the hunter, aghast.  Terribly sorry.  Have a
 shot at mine, over there.


 



Regards,


L. F.




Re: How to save a UTF-8 file on Windows using a non-ASCII name

2008-01-20 Thread Fan Decheng

On Jan 21, 11:41 am, Tony Mechelynck [EMAIL PROTECTED] wrote:
 Linxiao wrote:
  This is vim-dev maillist, better sending your question to [EMAIL PROTECTED] .

  PS. If you wanna save a UTF-8 content file, just :set fenc=utf-8  but
  not enc=utf-8.

  Good Luck!

 Tt, tt, tt... If 'encoding' is other than UTF-8 (or GB18030), Vim cannot
 represent all Unicode codepoints in memory; therefore, if you try to edit a
 UTF-8 file you run the risk of losing part of the data. (If you set 'enc' to
 UTF-16, UCS-2 or UCS-4 aka UTF-32, with any endianness, what Vim will use is
 actually UTF-8.)

 To edit UTF-8 data you should have both 'encoding' (= memory representation of
 the data) and 'fileencoding' (= disk representation of the data) set to UTF-8.


 Best regards,
 Tony.

Thanks! I tried these:

D:\test>gvim
:set fenc=utf-8
:set enc=utf-8
:e 测试.txt
E37: No write since last change (add ! to override)
:e! 测试.txt
:set fenc?
fileencoding=utf-8
:set enc?
encoding=utf-8

That works fine.  Note that in the above test I opened gvim with an
empty buffer first.  Another test shows some problems that I should
watch out for:

D:\test>gvim 测试.txt
:set fenc?
fileencoding=
:set enc?
encoding=cp936

Of course, in this situation the file contents are displayed in the
wrong encoding.  Then I tried this:

:set enc=utf-8

After this, the window title has changed to:
b2e2cad4.txt (D:\test) - GVIM
:set fenc?
fileencoding=
:set enc?
encoding=utf-8

I proceed with:
:e

The window title became:
b2e2cad4.txt = (D:\test) - GVIM
:set fenc?
fileencoding=utf-8
:set enc?
encoding=utf-8

However, as I observed, the swap file is still named .测试.txt.swp.

To make the window title correct and make the file non-read-only
(writable), I typed:

:e 测试.txt

A new buffer is opened and the window title changes back to the
original:
测试.txt = (D:\test) - GVIM

Now every write to the file is OK.  However, after exiting gvim, the
swap file is still there.

Sorry for writing at such length; this is just for reference.  I've
read the help for `fencs', but I didn't find it helpful in this
situation.
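
The odd window title in the second test looks like the raw bytes of the
file name written out in hex.  A minimal check in Python, assuming the
digits in the title are exactly those bytes (the variable names are made up
for the example), shows that they are valid GBK but not valid UTF-8:

```python
# Hex digits taken from the window title "b2e2cad4.txt".
raw = bytes.fromhex("b2e2cad4")

# Interpreted as GBK (cp936), the bytes are the original file name stem.
as_gbk = raw.decode("gbk")
print(as_gbk)

# Interpreted as UTF-8, the same bytes are not a valid sequence at all.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)
```

That would be consistent with the name becoming undisplayable as text once
'encoding' is switched to utf-8 while the stored bytes stay in cp936.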





Re: How to save a UTF-8 file on Windows using a non-ASCII name

2008-01-20 Thread Fan Decheng

On Jan 21, 11:43 am, Edward L. Fox [EMAIL PROTECTED] wrote:
 Hi Fan,

 On Jan 20, 2008 3:03 PM, Fan Decheng [EMAIL PROTECTED] wrote:



  Here I mean on the Windows platform, using Vim 6.4 or 7.1.

  I've encountered this problem several times, but don't know whether
  there is a solution:

  1. Use gvim to open a file with Chinese characters in its name. For
  example: 测试.txt .
  2. Type :set enc=utf-8 (without quotes).

 Here is a snippet from the Vim's reference:

 NOTE: Changing this option will not change the encoding of the
 existing text in Vim.  It may cause non-ASCII text to become invalid.
 It should normally be kept at its default value, or set when Vim
 starts up.  See |multibyte|.  To reload the menus see |:menutrans|.

Thanks. It seems that setting `encoding' before opening the file works.