Jan Nijtmans wrote:
>
> Both NUL-bytes and long lines no longer
> abort the function. This has the adverse effect that even
> binary files containing NUL bytes are always scanned
> completely, even though we know already that the file
> is binary.
>

Actually, given the variety of possible text encodings, we know very little
with absolute certainty.

>
> So, I would like to suggest to change looks_like_utf8/16() such
> that NUL-bytes mean that scanning the remaining bytes/chars
> of the blob is aborted, but not for long lines.
>

If I'm understanding you correctly, this will further complicate the code
without much benefit.

>
> This has no effect on fossil: LOOK_NUL has the highest
> priority of all flags, it should always be checked first
> by all calling code.
>

That should be up to the calling code. The looks_like_utf*() functions
should treat all flags with more-or-less equal priority.

>
> All other flags are meaningless whenever LOOK_NUL is set.
>

That is not necessarily the case. Again, this should be up to the calling
code.

>
> Any objections?
>

Yes, see above.

--
Joe Mistachkin
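
For readers following the thread, the two positions can be compared with a
small sketch. It is not Fossil's code: the names sk_scan, SK_NUL, SK_LONG and
SK_MAX_LINE, and the 8192-byte threshold, are hypothetical and chosen only for
illustration. The scanner records every flag it observes with equal priority,
and the caller decides through stop_mask whether any flag should end the scan
early.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag names for this sketch; not Fossil's LOOK_* values. */
enum {
  SK_NONE = 0,
  SK_NUL  = 1 << 0,   /* at least one NUL byte was seen */
  SK_LONG = 1 << 1    /* at least one line exceeded SK_MAX_LINE bytes */
};
#define SK_MAX_LINE 8192   /* assumed threshold, for illustration only */

/*
** Scan n bytes of buf and return the set of flags observed.
** stop_mask names the flags that end the scan early; pass SK_NONE to
** always scan the whole buffer. If scanned is not NULL, it receives the
** number of bytes actually examined.
*/
static int sk_scan(const char *buf, size_t n, int stop_mask, size_t *scanned){
  int flags = SK_NONE;
  size_t line_len = 0;
  size_t i;
  assert( buf != NULL || n == 0 );
  for(i = 0; i < n; i++){
    char c = buf[i];
    if( c == 0 ){
      flags |= SK_NUL;
    }
    if( c == '\n' ){
      line_len = 0;
    }else if( ++line_len > SK_MAX_LINE ){
      flags |= SK_LONG;
    }
    if( flags & stop_mask ){
      i++;            /* count the byte that triggered the stop */
      break;
    }
  }
  if( scanned ) *scanned = i;
  return flags;
}
```

Under these assumptions, sk_scan(buf, n, SK_NUL, &used) gives the "abort on
NUL, but not on long lines" behaviour, while sk_scan(buf, n, SK_NONE, &used)
scans everything and leaves the interpretation of the flags to the calling
code.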