This line can be found twice in the fossil source code,
and it refers to the functions looks_like_utf8() and
looks_like_utf16() (src/diff.c lines 233 and 336).

In Fossil 1.25 and earlier, looks_like_utf8/16() bailed out as
soon as either a NUL byte or a long line was encountered. Stopping
at long lines is a bad idea, because a NUL byte that follows the
long line would then go undetected. In commit [13fac7f74a]:
    <https://www.fossil-scm.org/index.html/info/13fac7f74a>
this is fixed. Neither NUL bytes nor long lines abort the
function any more. This has the adverse side effect that
binary files containing NUL bytes are now always scanned
completely, even though we already know after the first
NUL byte that the file is binary.

So, I would like to suggest changing looks_like_utf8/16() such
that a NUL byte aborts scanning of the remaining bytes/chars
of the blob, while a long line does not. This should have no
effect on Fossil: LOOK_NUL has the highest priority of all
flags, so it should always be checked first by all calling
code. All other flags are meaningless whenever LOOK_NUL is set.
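
To make the proposal concrete, here is a minimal sketch of the
scanning loop I have in mind. It is not the actual code from
src/diff.c: sketch_scan(), sketch_is_binary(), SKETCH_MAX_LINE,
LOOK_NONE, LOOK_LONG and all numeric flag values are assumptions
made up for illustration; only the name LOOK_NUL is taken from
the discussion above.

```c
#include <stddef.h>

/* Only the name LOOK_NUL comes from the discussion; the other
** names and all numeric values here are assumptions. */
#define LOOK_NONE 0x00  /* nothing special seen */
#define LOOK_NUL  0x01  /* a NUL byte was seen: content is binary */
#define LOOK_LONG 0x02  /* some line exceeds SKETCH_MAX_LINE bytes */

#define SKETCH_MAX_LINE 8192  /* assumed long-line threshold */

/* Hypothetical scanner illustrating the proposed behavior.
** A NUL byte ends the scan at once; a long line is recorded
** but scanning continues, so a later NUL is still found.
** *pnScanned (if not NULL) receives the number of bytes examined. */
static int sketch_scan(const char *z, size_t n, size_t *pnScanned){
  int flags = LOOK_NONE;
  size_t lineLen = 0;
  size_t i;
  for(i=0; i<n; i++){
    char c = z[i];
    if( c==0 ){
      flags |= LOOK_NUL;
      i++;                 /* count the NUL byte itself as examined */
      break;               /* nothing after this can change the verdict */
    }
    if( c=='\n' ){
      lineLen = 0;
    }else if( ++lineLen>SKETCH_MAX_LINE ){
      flags |= LOOK_LONG;  /* remember it, but keep scanning */
    }
  }
  if( pnScanned ) *pnScanned = i;
  return flags;
}

/* Callers test LOOK_NUL first; the other flags may be incomplete
** when it is set, because the scan stopped early. */
static int sketch_is_binary(int flags){
  return (flags & LOOK_NUL)!=0;
}
```

With this shape, a blob whose third byte is NUL is rejected after
three bytes, while a blob that starts with a very long line and
has a NUL further on still ends up with LOOK_NUL set.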

Any objections?

Regards,
           Jan Nijtmans