Re: [fossil-users] Question about the file formats.

Warren Young Tue, 20 Dec 2016 14:58:02 -0800

On Dec 20, 2016, at 12:35 PM, John Found <johnfo...@asm32.info> wrote:
> 
> Under "fossil algorithms" I mean two (in my understanding most important in 
> what is called "version control"): diff algorithm and 3-way merge algorithm.


When I said that Fossil can’t diff two binary files, I meant that it couldn’t 
display a sensible difference to the terminal when you give the “fossil diff” 
command.  However, Fossil *can store* the difference between any two files, 
regardless of binary vs. text, as I suggested with my uncompressed TIFF example.

Fossil will even do so for compressed formats like PNG, where in the worst 
case a single-bit change in the source image could change every byte of the 
resulting file.  That makes the internal deltas Fossil stores very large, 
possibly to the point that delta compression has no value at all and Fossil 
must simply store both versions in toto.  But Fossil will store those 
versions, and retrieve them.
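
To see the effect for yourself, here is a minimal sketch in Python.  It uses 
zlib as a stand-in for whatever compression the file format applies; the sample 
data and the helper name differing_positions are made up for illustration, and 
none of this is Fossil's own delta code.  How many compressed bytes actually 
change depends on the data and the compressor.

```python
import zlib


def differing_positions(a: bytes, b: bytes) -> int:
    """Count byte positions where a and b differ; a length gap counts as differences."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


# Hypothetical stand-in for uncompressed image data: 64 KiB of a repeating pattern.
original = bytes(i * 7 % 251 for i in range(65536))
# Flip a single bit in the first byte.
modified = bytes([original[0] ^ 0x01]) + original[1:]

packed_original = zlib.compress(original)
packed_modified = zlib.compress(modified)

raw_changes = differing_positions(original, modified)
packed_changes = differing_positions(packed_original, packed_modified)

print("raw bytes changed:       ", raw_changes)
print("compressed bytes changed:", packed_changes, "of", len(packed_original))
```

The raw inputs differ in exactly one position.  Compare that with the second 
number to judge how much of the compressed stream a delta would have to cover.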

As for merging, as long as the two versions Fossil is trying to merge have 
sufficient context between the changes to safely do the merge automatically, 
Fossil will do so.

Just as with diffing, if you use compression or encryption, or otherwise cause 
the changed regions to overlap, Fossil won’t be able to do the merge 
automatically.

This is no different from what we choose to call “text” files, where if two 
users make a change to the same area of a single file, chances are high that 
Fossil will refuse to attempt an automatic merge, since there isn’t enough 
context between the changes for Fossil to be sure it isn’t creating a mess in 
the merge area.
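
The idea can be sketched with a toy 3-way merge in Python.  This is not 
Fossil's algorithm: the function name merge3 is hypothetical, it only handles 
edits that keep the line count unchanged, and it ignores the surrounding-context 
requirement described above.  It only shows the basic rule that a change made 
on one side is taken automatically, while different changes to the same spot 
are reported as a conflict.

```python
def merge3(base, ours, theirs):
    """Toy line-wise 3-way merge for in-place edits (all three lists the same length).

    Returns (merged, conflicts): merged holds None at each conflicting index,
    and conflicts lists those indices.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles edits that keep the line count")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:          # both sides agree (unchanged, or the same change)
            merged.append(o)
        elif o == b:        # only "theirs" changed this line
            merged.append(t)
        elif t == b:        # only "ours" changed this line
            merged.append(o)
        else:               # both changed it differently: refuse to guess
            merged.append(None)
            conflicts.append(i)
    return merged, conflicts


base = ["a", "b", "c", "d"]

# Changes far apart: merged automatically.
clean, clean_conflicts = merge3(base, ["A", "b", "c", "d"], ["a", "b", "c", "D"])
print(clean, clean_conflicts)

# Both sides changed line 2 differently: reported as a conflict.
messy, messy_conflicts = merge3(base, ["a", "b", "X", "d"], ["a", "b", "Y", "d"])
print(messy, messy_conflicts)
```

Nothing in this rule cares whether the units being compared are lines of text 
or runs of bytes.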

> Or what makes the 3-way merge algorithm not working on binary files.

Except for whole-file compression and similar cases (e.g. pre-checkin 
encryption) I don’t think you can create a replicable test that shows that it 
doesn’t work.

> What if I design some text file format (containing only ascii characters) and
> it can't be properly processed by fossil?

Then you should post it as a replicable test case for our study.  Until you can 
do both things — i.e. cause a problem and create a replicable test case for it 
— you’re just speculating.

> Another example: Every binary file can be BASE64 encoded and it will be
> turned into a valid text file. Fossil will not detect it as a binary. But
> whether this file will be processed properly on diffs and merges? Probably
> not. But why?

I don’t believe such an encoding will have a meaningful effect on any test, 
except that it effectively adds newlines every 70-some characters, where the 
original binary data might not have them, so “binary” data would now be 
detected as “text” data.
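
A quick way to check this is a small Python sketch.  It assumes MIME-style 
Base64 as produced by Python's base64.encodebytes, which wraps its output at 
76 characters per line; the sample data is made up.  It changes one byte of 
the binary input and then lists which encoded lines differ.

```python
import base64

# Hypothetical binary payload: 1 KiB of every byte value repeated.
data = bytes(range(256)) * 4

# Change a single byte somewhere in the middle.
changed = bytearray(data)
changed[500] ^= 0xFF

before = base64.encodebytes(data).splitlines()
after = base64.encodebytes(bytes(changed)).splitlines()

# encodebytes emits one line per 57 input bytes, so no line exceeds 76 characters.
longest = max(len(line) for line in before)
differing = [i for i, (x, y) in enumerate(zip(before, after)) if x != y]

print("lines:", len(before), "longest line:", longest)
print("indices of lines that differ:", differing)
```

Because each encoded line depends only on its own 57 input bytes, a local 
change in the binary stays a local change in the Base64 text.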

But, if the problem is that delta compression is inefficient with a given 
binary file because nearly every byte changes when you change just one small 
bit of the input file, then the same will be true of the Base64-encoded version.
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
