Jan Nijtmans wrote:
> 
> Both NUL-bytes and long lines no longer
> abort the function. This has the adverse effect that even
> binary files containing NUL bytes are always scanned
> completely, even though we know already that the file
> is binary.
> 

Actually, given the variety of possible text encodings, we know
very little with absolute certainty.

>
> So, I would like to suggest to change looks_like_utf8/16() such
> that NUL-bytes mean that scanning the remaining bytes/chars
> of the blob is aborted, but not for long lines.
>

If I'm understanding you correctly, this will further complicate
the code without much benefit.

>
> This has no effect on fossil: LOOK_NUL has the highest
> priority of all flags, it should always be checked first
> by all calling code.
>

That should be up to the calling code.  The looks_like_utf*() functions
should treat all flags with more-or-less equal priority.
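
To make the distinction concrete, here is a minimal sketch of that
division of labor. It is not the actual fossil code: the flag values,
the LOOK_LONG name, the maxLine parameter, and the scan_flags() and
classify() helpers are all assumptions made up for illustration. Only
the LOOK_NUL name comes from this thread. The scanner reports every
condition it finds and never stops early; the caller decides which flag
matters most.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
#define LOOK_NUL   0x01  /* at least one NUL byte was seen */
#define LOOK_LONG  0x02  /* at least one line exceeded maxLine bytes */

/*
** Scan all n bytes of z and return the set of conditions found.
** No flag ends the scan early, so every flag is reported with
** equal priority.
*/
static int scan_flags(const unsigned char *z, size_t n, size_t maxLine){
  int flags = 0;
  size_t i, len = 0;
  assert( z!=NULL || n==0 );
  for(i=0; i<n; i++){
    if( z[i]==0 ){
      flags |= LOOK_NUL;
    }
    if( z[i]=='\n' ){
      len = 0;                 /* a newline starts a fresh line */
    }else if( ++len>maxLine ){
      flags |= LOOK_LONG;
    }
  }
  return flags;
}

/*
** One possible caller policy: check LOOK_NUL first, then LOOK_LONG.
** A different caller is free to order these checks differently.
*/
static const char *classify(int flags){
  if( flags & LOOK_NUL ) return "binary";
  if( flags & LOOK_LONG ) return "long-lines";
  return "text";
}
```

With this shape, a caller that only cares about NUL bytes can test
LOOK_NUL and ignore the rest, while another caller can still see that
the same content also had an overlong line.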

>
> All other flags are meaningless whenever LOOK_NUL is set.
> 

That is not necessarily the case.  Again, this should be up to the
calling code.

>
> Any objections?
> 

Yes, see above.

--
Joe Mistachkin

_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
