My recent experiments with fossil, git and mercurial were so
interesting that I have spent the last few days thinking about how to
reduce the size of a fossil repository.

I know that local storage is cheap, but I also think that it could be
a good feature for fossil to achieve a very good compression ratio.

As the title of this mail suggests, I thought that it could be a
feature of fossil to "fossilize" the oldest artifacts in a repository
in an efficient way: just as git creates object "packs", fossil could
create "super-fossils", that is, fossils composed of other fossils
compressed together in order to achieve a better compression ratio.
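
To make the expected gain concrete, here is a minimal sketch that
compares compressing each artifact on its own with compressing them as
one stream. It uses Python's zlib purely as a stand-in; the function
names are hypothetical and nothing here reflects how fossil actually
stores its artifacts.

```python
import zlib

def separate_size(artifacts: list[bytes]) -> int:
    """Total size when every artifact is compressed independently."""
    return sum(len(zlib.compress(a, 9)) for a in artifacts)

def solid_size(artifacts: list[bytes]) -> int:
    """Size when all artifacts are concatenated and compressed as one stream."""
    return len(zlib.compress(b"".join(artifacts), 9))
```

When the artifacts are successive revisions of the same file, the
second number should be much smaller, because the compressor can refer
back to content it has already seen in an earlier artifact.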

The key advantage of fossil with respect to other SCMs is that at the
core level a fossil repository is *unordered*. This means that we are
free to arrange the artifacts in whatever order gives the best ratio
when they are compressed together.

Finding the best order is, of course, an NP-hard problem... essentially
the so-called "Travelling Salesman Problem" (TSP)... but there are some
heuristics. I have found this page which points to other papers:

http://www.paul.sladen.org/projects/compression/

especially the chapter

"Pre-compression optimisation of tar archive ordering using clustering
methods"
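
As an illustration of what such a heuristic might look like, here is a
minimal sketch of a greedy nearest-neighbour ordering. It is not the
method from the page above: the distance is the normalized compression
distance computed with zlib, and every name in it is hypothetical. The
order matters for zlib because deflate can only refer back about 32 KiB,
so similar artifacts help each other only when they sit close together.

```python
import zlib

def csize(data: bytes) -> int:
    """Compressed size of data, used as a rough measure of information."""
    return len(zlib.compress(data, 9))

def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance: small when a and b share content."""
    ca, cb = csize(a), csize(b)
    return (csize(a + b) - min(ca, cb)) / max(ca, cb)

def greedy_order(artifacts: list[bytes]) -> list[int]:
    """Nearest-neighbour tour: start at artifact 0, then repeatedly
    append the unvisited artifact closest to the last one chosen."""
    if not artifacts:
        return []
    order = [0]
    remaining = set(range(1, len(artifacts)))
    while remaining:
        last = artifacts[order[-1]]
        # Ties are broken by index so the result is deterministic.
        nxt = min(remaining, key=lambda i: (ncd(last, artifacts[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def packed_size(artifacts: list[bytes], order: list[int]) -> int:
    """Size of the artifacts compressed as one stream in the given order."""
    return csize(b"".join(artifacts[i] for i in order))
```

Comparing packed_size(artifacts, greedy_order(artifacts)) with
packed_size(artifacts, list(range(len(artifacts)))) shows whether the
reordering pays off for a given set. The greedy tour needs a quadratic
number of compressions, so for a real repository some cheaper
similarity measure would probably be required.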


Of course storing a fossil of fossils does degrade performance because
the fossil must be decompressed...

but on the other hand a user may want to "fossilize" the ancient
history of a project... that is to store it efficiently at the cost of
a higher access time.

For example, we could have a command called "fossilize" which may be
used in this way:

fossil fossilize --before 2005

In this case fossil would take all the artifacts that existed before
2005, deconstruct them on disk, compress them using
"in-file-similarities", create a sort of super-fossil (indexed in
some way) and store this super-fossil in the database.
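
The steps above could be sketched roughly as follows. This is only an
illustration under stated assumptions: artifacts are modelled as an
in-memory dict from id to (timestamp, content), the index is a plain
dict of offsets, lzma stands in for whatever compressor would be used,
and the names fossilize and read_fossilized are hypothetical, not
existing fossil commands or APIs.

```python
import lzma
from datetime import datetime

def fossilize(artifacts: dict[str, tuple[datetime, bytes]], before: datetime):
    """Pack every artifact older than `before` into one compressed blob.

    Returns (blob, index, remaining). `index` maps an artifact id to its
    (offset, length) inside the decompressed stream; `remaining` holds
    the newer artifacts, left untouched.
    """
    # Sorting by id is only a placeholder; an ordering that puts similar
    # artifacts next to each other would go here.
    old = sorted(aid for aid, (ts, _) in artifacts.items() if ts < before)
    index, chunks, offset = {}, [], 0
    for aid in old:
        data = artifacts[aid][1]
        index[aid] = (offset, len(data))
        chunks.append(data)
        offset += len(data)
    blob = lzma.compress(b"".join(chunks))
    remaining = {aid: v for aid, v in artifacts.items() if aid not in index}
    return blob, index, remaining

def read_fossilized(blob: bytes, index: dict, aid: str) -> bytes:
    """Fetch one artifact; the whole blob is decompressed first,
    which is the "slow" access path."""
    offset, length = index[aid]
    return lzma.decompress(blob)[offset:offset + length]
```

Reading a single old artifact costs a full decompression of the
super-fossil here, which is exactly the trade-off of space against
access time.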

Now the history of the project before 2005 is present, albeit in a
"slow" form.

The new artifacts, after 2005, will be stored in the database as they
are now.


Well, these are only random Sunday thoughts, just to share some
ideas...

Lino





_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
