Zhihao Yuan wrote:
On Mon, Jun 22, 2020 at 5:24 PM Gleb Smirnoff <gleb...@freebsd.org> wrote:


    My first attempt was this fix:

    --- common/exf.c        (revision 362200)
    +++ common/exf.c        (working copy)
    @@ -1252,7 +1252,8 @@ file_encinit(SCR *sp)
             else if (O_ISSET(sp, O_FILEENCODING) &&
                 strcasecmp(O_STR(sp, O_FILEENCODING), "utf-8") != 0)
                     /* Use fileencoding as is */ ;
    -       else if (strcasecmp(codeset(), "utf-8") != 0)
    +       else if (strncasecmp(codeset() + strlen(codeset()) - 5, "utf-8", 5) !=
    +           0)
                     o_set(sp, O_FILEENCODING, OS_STRDUP, codeset(), 0);
             else
                     o_set(sp, O_FILEENCODING, OS_STRDUP, "iso8859-1", 0);

    But that turned out not to be the cause. To my surprise, codeset(),
    which is a wrapper around nl_langinfo(), returns US-ASCII in my case.
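
A side note on the suffix comparison in the patch above: the expression
codeset() + strlen(codeset()) - 5 points before the start of the string
whenever the codeset name is shorter than five characters. A minimal
sketch of a guarded check follows. The helper name ends_with_nocase is
hypothetical and not part of nvi.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical helper, not part of nvi: return 1 if `s` ends with
 * `suffix`, compared case-insensitively, and 0 otherwise.  The length
 * check keeps the pointer arithmetic from stepping before the start
 * of a string that is shorter than the suffix.
 */
static int
ends_with_nocase(const char *s, const char *suffix)
{
    size_t slen, xlen;

    assert(s != NULL && suffix != NULL);
    slen = strlen(s);
    xlen = strlen(suffix);
    if (slen < xlen)
        return (0);
    return (strcasecmp(s + slen - xlen, suffix) == 0);
}
```

With such a helper the condition could read
!ends_with_nocase(codeset(), "utf-8"), assuming codeset() never
returns NULL.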


That sounds strange.

   1. Can you set LC_CTYPE as well and see
     if anything changes?
   2. Can you revert to the previous version
     and see what nl_langinfo gives?
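
For the second check, a small standalone test may be easier than
instrumenting vi. The sketch below prints what nl_langinfo(CODESET)
reports before and after setlocale(LC_CTYPE, ""). A C program starts
in the "C" locale, so the first value is the "C" locale's codeset no
matter what the environment says. Whether that is what happens inside
nvi is an assumption to verify. The helper names current_codeset and
report_codeset are hypothetical.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the current codeset name into buf. */
static void
current_codeset(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    /* nl_langinfo() may reuse its buffer, so take a copy. */
    snprintf(buf, len, "%s", nl_langinfo(CODESET));
}

/* Print the codeset before and after adopting the environment. */
static void
report_codeset(void)
{
    char before[64], after[64];

    current_codeset(before, sizeof(before));
    /* "" selects the locale from LC_ALL, LC_CTYPE, or LANG. */
    if (setlocale(LC_CTYPE, "") == NULL)
        fprintf(stderr, "setlocale failed, locale unchanged\n");
    current_codeset(after, sizeof(after));
    printf("before setlocale: %s\n", before);
    printf("after setlocale:  %s\n", after);
}
```

Calling report_codeset() from main() with LC_CTYPE set to a UTF-8
locale should show whether the environment is picked up at all.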

There is another issue... I'm sorry.  I totally forgot what
looks_utf8 actually does.

Here is its behavior (encoding.c):

  Returns
  -1: invalid UTF-8
   0: uses odd control characters, so doesn't look like text
   1: 7-bit text
   2: definitely UTF-8 text (valid high-bit set bytes)

So if looks_utf8() > 1, the file itself is definitely
UTF-8.  If you open a file containing only 7-bit text or
control characters, :set fileencoding should set the
encoding used for writing.  But the behavior in HEAD
is that you can't input Unicode.
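
To make the intended behavior concrete, here is a rough sketch of the
decision described above. It is not the code in exf.c. The enum, the
function name choose_encoding, and the user_set flag are all
hypothetical, and treating -1 the same as 0 and 1 is my assumption.

```c
#include <assert.h>

/* Hypothetical outcome of the encoding decision. */
enum enc_choice {
    ENC_UTF8,   /* file content proves it is UTF-8 */
    ENC_USER,   /* honor the user's :set fileencoding */
    ENC_LOCALE  /* fall back to the locale's codeset */
};

/*
 * looks:    return value of looks_utf8() (-1, 0, 1, or 2)
 * user_set: nonzero if the user set fileencoding explicitly
 */
static enum enc_choice
choose_encoding(int looks, int user_set)
{
    assert(looks >= -1 && looks <= 2);
    /* Only 2 means valid high-bit bytes were seen. */
    if (looks > 1)
        return (ENC_UTF8);
    /*
     * 7-bit text (1) and odd control characters (0) say nothing
     * about the encoding to write, so the user's choice wins.
     * Invalid UTF-8 (-1) is handled the same way here, which is
     * an assumption of this sketch.
     */
    if (user_set)
        return (ENC_USER);
    return (ENC_LOCALE);
}
```

The point is that a plain 7-bit file must not lock the user out of
UTF-8: choose_encoding(1, 1) yields ENC_USER, so a user who sets
fileencoding to utf-8 can still enter Unicode.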

I'm reverting upstream.

Yes, I will revert for now.
_______________________________________________
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
