>Importing a 25955 line contents file (2.1 megs) resulted in a 2.3 meg
>database.  The c_files table had 25955 records, c_pkgs had 193 records,
>and c_match had 25759 records.  Theoretically, c_match should have had
>25955 records, so I'm not 100% sure what happened.  I'll have to do some
>more digging in my code.

I'm surprised at the small size overhead.  Our own attempts were
not that successful.


Bill asked me about my experiment; it was something like this:
I built a small door-based server which loaded the database in memory.

The principle behind this was that all package transactions would go
through the door server and not the contents file (easy to manage when
all you use are the *pkg* tools at install time).
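
A minimal sketch of how such a door server could be set up (from memory;
the rendezvous path and the contents_lookup() helper are made up for
illustration, the door_create()/fattach() plumbing is just the standard
doors pattern):

/* Minimal door server serving lookups from an in-memory copy
   of the contents file. */
#include <sys/types.h>
#include <door.h>
#include <fcntl.h>
#include <stropts.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define DOOR_PATH "/var/run/contents_door"	/* hypothetical rendezvous file */

/* Hypothetical lookup against the in-memory contents data. */
extern const char *contents_lookup(const char *path);

/* Door server procedure: argp holds the path being queried. */
static void
server_proc(void *cookie, char *argp, size_t arg_size,
    door_desc_t *dp, uint_t n_desc)
{
	const char *line = contents_lookup(argp);
	char reply[1024];

	(void) strlcpy(reply, line != NULL ? line : "", sizeof (reply));
	(void) door_return(reply, strlen(reply) + 1, NULL, 0);
}

int
main(void)
{
	int dfd, ffd;

	/* load the contents file into memory here */

	if ((dfd = door_create(server_proc, NULL, 0)) < 0) {
		perror("door_create");
		return (1);
	}

	/* Attach the door to a well-known file so clients can find it. */
	ffd = open(DOOR_PATH, O_CREAT | O_RDWR, 0644);
	(void) close(ffd);
	(void) fdetach(DOOR_PATH);
	if (fattach(dfd, DOOR_PATH) < 0) {
		perror("fattach");
		return (1);
	}

	/* The doors library services incoming calls on its own threads. */
	(void) pause();
	return (0);
}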

Rather than rewriting the file, the daemon would output, with equal
transactional safety, a log file of deltas which simply contained removed
or changed contents lines, each with a one-byte prefix saying which was
which.  The daemon could occasionally write out a new contents
file.  If the daemon was terminated at any point in time, the data
would all be recoverable from the log file and the old contents file.
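
Appending one such delta record could look like the sketch below (the
'-'/'=' prefix bytes are made up here; the point is that each record hits
stable storage before the operation is acknowledged, so the old contents
file plus the log always reconstructs the current state):

#include <stdio.h>
#include <unistd.h>

/* Append one delta record: a one-byte op ('-' removed, '=' changed,
   names illustrative) followed by the contents line, then force it
   to disk before the caller treats the change as committed. */
static int
log_delta(FILE *logfp, char op, const char *contents_line)
{
	if (fprintf(logfp, "%c%s\n", op, contents_line) < 0)
		return (-1);
	if (fflush(logfp) != 0)
		return (-1);
	return (fsync(fileno(logfp)));
}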

The purpose-written in-memory hash was fairly efficient (a small multiplier
over the file size), as opposed to our first attempt, which consumed
hundreds of megabytes.
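
Something along these lines keeps the footprint down: map the contents
file once and have the hash table store only line offsets rather than
copies of the strings.  The names and table sizing below are illustrative
only, not the actual code:

#include <stdint.h>
#include <string.h>

#define NBUCKETS	(1 << 20)	/* power of two, > number of lines */

static const char *contents_base;	/* mmap()ed contents file */
static uint32_t offsets[NBUCKETS];	/* 0 == empty; else line offset + 1 */

static uint32_t
hash_path(const char *path, size_t len)
{
	uint32_t h = 5381;

	while (len-- > 0)
		h = h * 33 + (unsigned char)*path++;
	return (h);
}

/* Insert the contents line starting at 'off'; its first field is the path. */
static void
hash_insert(uint32_t off)
{
	const char *line = contents_base + off;
	size_t len = strcspn(line, " \t");
	uint32_t i = hash_path(line, len) & (NBUCKETS - 1);

	while (offsets[i] != 0)			/* linear probing */
		i = (i + 1) & (NBUCKETS - 1);
	offsets[i] = off + 1;
}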

The benefit of the approach is that it keeps the contents file (and
we found many bits depending on it) but does away with the huge amount of
I/O associated with it.

Casper
