On Dec 20, 2016, at 11:48 AM, John Found <johnfo...@asm32.info> wrote:
>
> I know that fossil (and most other version control systems) can handle properly
> only text source files.

Says who?

There are some features of Fossil that simply don’t work when given a binary file, like “fossil diff,” but if you think this is a missing feature (or even a bug!) I’d have to ask how you think it should work.

Consider the case of a PNG. How would you expect “fossil diff” to show the difference between two PNGs? Now multiply by the number of other binary file formats.

It is also the case that checking in compressed binary files is generally a mistake, since that will largely defeat the built-in diffing and compression mechanisms in Fossil, bloating the repository on every checkin. (For some use cases, you can now avoid this problem with the new unversioned files feature.)

Both of those classes of problem aside, Fossil will certainly accept “binary” files.

> What makes the binary files different from the text files? The presence or absence of
> 0 bytes does not seems to make serious difference for processing by the same
> algorithms.

Fossil uses a heuristic to decide if a given file is “binary” or not, and it has more to do with the chance that it will display properly when served to a web browser than anything else.

Because it is a heuristic, it is possible to trick it. For example, a file with very long text lines may be misdetected as “binary,” because the heuristic runs out of buffer space looking for the first line terminator.

> What properties a file format needs in order to be processed properly by
> fossil?

Give a specific use case. The answer differs depending on what Fossil commands you want to be able to use on the files you check in.

I gave the “diff” case above, but that is not the only command that changes behavior depending on whether the binary file heuristic decides that the file is “binary.”

I’m putting “binary” in quotes because it is not a clear-cut distinction. For Fossil’s purposes, an uncompressed TIFF is “less binary” than a PNG file, because it is possible to do useful levels of delta compression on the TIFF but not on the PNG.

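A rough way to see why compressed formats delta poorly is to make a one-byte edit to some uncompressed data and compare how much of the byte stream changes before and after compression. The sketch below is only an illustration under stated assumptions: the sample data is made up, zlib stands in for the compression inside a format like PNG, and the byte-by-byte comparison is a crude proxy for what a delta encoder would have to store. It is not Fossil's delta algorithm.

```python
import random
import zlib


def differing_bytes(a: bytes, b: bytes) -> int:
    """Crude change measure: mismatched positions plus any length difference."""
    mismatched = sum(x != y for x, y in zip(a, b))
    return mismatched + abs(len(a) - len(b))


# Build a few kilobytes of made-up "uncompressed" content (seeded, so repeatable).
rng = random.Random(0)
words = [b"alpha", b"beta", b"gamma", b"delta", b"epsilon"]
original = b" ".join(rng.choice(words) for _ in range(2000))

# A one-byte edit at the start, standing in for a small change between checkins.
edited = b"#" + original[1:]

raw_changed = differing_bytes(original, edited)
packed_changed = differing_bytes(zlib.compress(original), zlib.compress(edited))

print(f"uncompressed: {raw_changed} of {len(original)} bytes differ")
print(f"compressed:   {packed_changed} bytes differ")
```

In the uncompressed pair exactly one byte differs, so a delta between the two versions can be tiny. In the compressed pair the change spreads through the stream, which leaves a delta encoder with little shared content to reuse.
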
> Is it enough for a file to contains only utf-8 characters or some other properties are
> mandatory as well?

If you want to know the heuristic’s current implementation details, study looks_like_utf8() in src/lookslike.c. (There is also a UTF-16 version of that function, typically needed on Windows.)

> Is it possible to define such binary file format that to be properly processed
> by fossil (of course, after removing the explicit binary file checks)?
>
> Or the opposite question: Is it possible to compose such text file that to not be
> processed properly by fossil algorithms?

Both questions should be answered by a study of that heuristic function.

If you have further questions, make your questions more specific. Your current questions are so vague that I can give the answer “Yes” to both, and be correct. Not useful, I realize, but correct. :)

_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users