On 30/08/13 16:51, Bram Moolenaar wrote:

Patch 7.4.013
Problem:    File name buffer too small for utf-8.
Solution:   Use character count instead of byte count. (Ken Takata)
Files:      src/os_mswin.c


*** ../vim-7.4.012/src/os_mswin.c       2013-08-30 16:44:15.000000000 +0200
--- src/os_mswin.c      2013-08-30 16:47:54.000000000 +0200
***************
*** 456,462 ****
--- 456,469 ----
       int
   mch_isFullName(char_u *fname)
   {
+ #ifdef FEAT_MBYTE
+     /* WinNT and later can use _MAX_PATH wide characters for a pathname, which
+      * means that the maximum pathname is _MAX_PATH * 3 bytes when 'enc' is
+      * UTF-8. */
+     char szName[_MAX_PATH * 3 + 1];
+ #else
       char szName[_MAX_PATH + 1];
+ #endif

       /* A name like "d:/foo" and "//server/share" is absolute */
       if ((fname[0] && fname[1] == ':' && (fname[2] == '/' || fname[2] == '\\'))
***************
[...]
***************
*** 498,507 ****
       int
   vim_stat(const char *name, struct stat *stp)
   {
       char     buf[_MAX_PATH + 1];
       char     *p;

!     vim_strncpy((char_u *)buf, (char_u *)name, _MAX_PATH);
       p = buf + strlen(buf);
       if (p > buf)
        mb_ptr_back(buf, p);
--- 505,521 ----
       int
   vim_stat(const char *name, struct stat *stp)
   {
+ #ifdef FEAT_MBYTE
+     /* WinNT and later can use _MAX_PATH wide characters for a pathname, which
+      * means that the maximum pathname is _MAX_PATH * 3 bytes when 'enc' is
+      * UTF-8. */
+     char      buf[_MAX_PATH * 3 + 1];
+ #else
       char     buf[_MAX_PATH + 1];
+ #endif
       char     *p;

!     vim_strncpy((char_u *)buf, (char_u *)name, sizeof(buf) - 1);
       p = buf + strlen(buf);
       if (p > buf)
        mb_ptr_back(buf, p);
*** ../vim-7.4.012/src/version.c        2013-08-30 16:44:15.000000000 +0200
--- src/version.c       2013-08-30 16:47:36.000000000 +0200
***************
[...]

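Note: Unicode codepoints above U+FFFF require 4 bytes each in UTF-8. In UTF-16, those codepoints (U+10000 up to U+10FFFF) are represented by a surrogate pair (two 16-bit words). I don't know how frequent those "high" codepoints are in practice; IIUC some of them are "rare" hanzi/kanji still used in Chinese and/or Japanese family names. If Windows counts each of them as "two characters" when determining whether _MAX_PATH has been exceeded, then the above logic is correct: a surrogate pair costs 4 bytes for 2 wide characters, i.e. 2 bytes per wide character, so the worst case remains the 3 bytes per wide character needed for codepoints U+0800 to U+FFFF. If instead each such codepoint were counted as one character, _MAX_PATH * 3 bytes would be too small for a name made up of them.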
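To make the arithmetic concrete, here is a minimal sketch that checks the "3 bytes per wide character" bound over the whole Unicode range. It assumes that "wide character" means one 16-bit UTF-16 code unit, which is exactly the open question above. The helper names (utf8_len, utf16_units, bound_holds_everywhere) are made up for illustration and are not part of Vim or the Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode code point cp in UTF-8. */
static size_t utf8_len(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80)
        return 1;
    if (cp < 0x800)
        return 2;
    if (cp < 0x10000)
        return 3;
    return 4;
}

/* 16-bit units needed to encode cp in UTF-16 (2 means a surrogate pair). */
static size_t utf16_units(uint32_t cp)
{
    return cp < 0x10000 ? 1 : 2;
}

/* Returns 1 if every code point needs at most 3 UTF-8 bytes per UTF-16
 * unit, i.e. if a buffer of (units * 3) bytes is always large enough. */
static int bound_holds_everywhere(void)
{
    uint32_t cp;

    for (cp = 0; cp <= 0x10FFFF; ++cp)
    {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            continue;   /* surrogate values are not characters themselves */
        if (utf8_len(cp) > 3 * utf16_units(cp))
            return 0;
    }
    return 1;
}
```

For example, U+20000 (a CJK Extension B ideograph) gives utf8_len() == 4 and utf16_units() == 2, so it costs 2 bytes per unit, while U+20AC gives 3 bytes for 1 unit, which is the worst case. The sketch says nothing about how Windows actually counts against _MAX_PATH; it only shows that the bound holds if the count is in UTF-16 units.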


Best regards,
Tony.
--
Greener's Law:
        Never argue with a man who buys ink by the barrel.
