Dennis Cote <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > Dave Gierok <[EMAIL PROTECTED]> wrote:
> >
> >> It looks like the size of a Sqlite DB ends up being much larger
> >> (more than 2x) than size that I calculate for its data set.
> >>
> >> A simple test shows that when creating one table with one integer
> >> column and filling it with 10000 rows, I get a DB size of 92KB
> >> instead of what I'd expect to be around 40KB plus some small
> >> overhead for the table definition. This seems to scale linearly
> >> as I increase the amount of data in the DB.
> >>
> >
> > SQLite stores 64-bit integers, not 32-bit as you suppose. And
> > each row also stores a 64-bit integer rowid in addition to the
> > data. So that it fits in 92KB instead of the (naively expected)
> > 160KB suggests that SQLite is actually doing a reasonable job of
> > compressing the data.
> >
> I hate to disagree with the author, but that description is not quite
> accurate. :-)
>
> SQLite uses variable length integer storage...

No. I'm going to stand by what I said.

SQLite works with 64-bit integer values. When writing those values
to the disk, various compression techniques are used to avoid having
to take up 8 bytes of disk space in the common case where most of
those bytes are going to be zero.

Various encodings are used. All of them are Huffman codes over a
fixed probability distribution. Dennis calls these "variable length
integers". I call them integers that are compressed using a Huffman
code. That's the same thing in practice. But the nomenclature is
important because I can point to Huffman's PhD thesis in 1952 to
prove that the on-disk representation of integers in SQLite is not
patentable.

--
D. Richard Hipp <[EMAIL PROTECTED]>
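
For readers following the thread, here is a minimal sketch of one such
encoding: the 1-to-9-byte variable-length integer that the SQLite file
format documentation describes for rowids and record headers. It is an
illustration written for this post, not code taken from SQLite; the
function names encode_varint and decode_varint are made up here. Each of
the first eight bytes carries 7 data bits plus a "more bytes follow" flag
in the high bit; a ninth byte, if present, carries a full 8 bits.

```python
def encode_varint(value: int) -> bytes:
    """Encode an unsigned 64-bit integer as a big-endian varint (1-9 bytes)."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value must fit in an unsigned 64-bit integer")

    if value >> 56:
        # Any of the top 8 bits set: use the 9-byte form.
        # The last byte holds 8 raw bits; the first eight hold 7 bits each.
        out = [value & 0xFF]
        value >>= 8
        for _ in range(8):
            out.append((value & 0x7F) | 0x80)
            value >>= 7
        return bytes(reversed(out))

    # Short form: 7 bits per byte, high bit set on every byte but the last.
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))


def decode_varint(data: bytes) -> tuple[int, int]:
    """Decode a varint from the start of data; return (value, bytes_used)."""
    value = 0
    for i in range(8):
        byte = data[i]
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:  # high bit clear marks the final byte
            return value, i + 1
    # Ninth byte contributes all 8 of its bits.
    return (value << 8) | data[8], 9


if __name__ == "__main__":
    for n in (0, 127, 128, 10000, 2**32, 2**64 - 1):
        print(n, len(encode_varint(n)))
```

Under this scheme every rowid from 1 to 10000 takes one or two bytes
rather than eight, which is consistent with the 92KB figure being well
under the naive 160KB.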