https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202290
--- Comment #1 from la...@fit.vutbr.cz ---

Looking at /usr/src/contrib/nvi/common/exf.c:

    file_encinit(SCR *sp)
    ...
        if (looks_utf8(buf, blen) > 1)
            o_set(sp, O_FILEENCODING, OS_STRDUP, "utf-8", 0);
        else if (!O_ISSET(sp, O_FILEENCODING) ||
            !strncasecmp(O_STR(sp, O_FILEENCODING), "utf-8", 5))
            o_set(sp, O_FILEENCODING, OS_STRDUP, codeset(), 0);
        conv_enc(sp, O_FILEENCODING, 0);
    }

1. There is no way to disable auto-detection of the encoding. If
   looks_utf8() returns 2, you are lost: you can set up your .exrc, but it
   will be ignored!

2. But why does looks_utf8() detect 0xe1 0x20 as valid UTF-8?
   IT IS NOT VALID!

Looking at /usr/src/contrib/nvi/common/encoding.c:

    looks_utf8(const char *ibuf, size_t nbytes)
    ...
        for (n = 0; n < following; n++) {
            i++;
            if (i >= nbytes)
                goto done;
            if (buf[i] & 0x40)    /* 10xxxxxx */
                return -1;
        }

That is completely wrong: it does not test whether bit 7 is set in the
following bytes. It only rejects bytes that have bit 6 set, so 0x20 passes
as a continuation byte. It should be:

        for (n = 0; n < following; n++) {
            i++;
            if (i >= nbytes)
                goto done;
            if ((buf[i] & 0xc0) != 0x80)    /* 10xxxxxx */
                return -1;
        }

This change was tested and works.

Please fix at least the broken auto-detection before 10.2-RELEASE! But some
option to disable auto-detection, or to honor the user's setting in .exrc,
is also required.
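
For illustration, the continuation-byte test discussed in this comment can be sketched as a small standalone check. This is only a sketch under stated assumptions, not the nvi code: is_continuation() and check_following() are hypothetical names that do not exist in encoding.c, and the return values 1, 0 and -1 are chosen here for the example only. A UTF-8 continuation byte has the form 10xxxxxx, so masking with 0xc0 must give 0x80.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of nvi): true if c has the form
 * 10xxxxxx, i.e. bit 7 is set and bit 6 is clear.
 */
static int
is_continuation(unsigned char c)
{
	return ((c & 0xc0) == 0x80);
}

/*
 * Hypothetical helper (not part of nvi): check that `following`
 * continuation bytes come after the lead byte at buf[i].
 * Returns  1 if all of them are present and well formed,
 *          0 if the buffer ends before the sequence is complete,
 *         -1 if one of the bytes is not a continuation byte.
 */
static int
check_following(const unsigned char *buf, size_t nbytes, size_t i,
    int following)
{
	int n;

	for (n = 0; n < following; n++) {
		i++;
		if (i >= nbytes)
			return (0);	/* truncated sequence */
		if (!is_continuation(buf[i]))
			return (-1);	/* e.g. 0x20 after 0xe1 */
	}
	return (1);
}
```

With this check, the sequence 0xe1 0x20 is rejected because 0x20 & 0xc0 is 0x00, while 0xe1 0x88 0xb4 (U+1234) is accepted. A test that only looks at bit 6, such as buf[i] & 0x40, lets 0x20 through because bit 6 is clear in that byte.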