-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/02/2011 11:48 AM, Bert Nelsen wrote:
> Because I felt so stupid about these mostly empty columns taking so much
> space, I tended to replace all the "phone" columns by a single column named
> "customerPhone".
> I stored the values into customerPhone like that:
> 
> <cpp=><cpc=><cpm=93829343><cpw=><cp1=234928734><cp2=>

You want to know something hilarious?  Your space saving mechanism uses
*more* space than the empty columns would consume.  By my calculations 150%
additional space for the example above!

Some databases have a storage scheme where if you declare a column
"char(256)" then 256 bytes of storage are consumed even if you store nothing
in that column.  If you are used to that then empty columns would indeed add
up to a lot.

SQLite does not work that way.  Look up its type affinity documentation
(link below) to see.  Every record has a header followed by the values for
that record.  The header entries indicates the type (and for blob/string the
length) of the corresponding value.

A record with 100 columns where every single one contains NULL or a zero
length string would occupy 100 bytes of record header and zero bytes of
record values.  Note that an old school database would be substantially
similar even if it recorded the type information for the table rather than
the record since it would still need a per record way of indicating whether
a value is null versus the declared type.  (ie it would need at least one
bit per value, so in this case would need around 16 bytes.)

Using your example above, you have 6 columns.  Storing them as 6 columns in
SQLite with the two numbers would occupy 22 bytes for both the record header
and 2 non-empty values.

Storing it as one column with that hackery consumes 54 bytes.  Heck your
empty storage is 37 bytes compared to 6 empty SQLite columns being 6 bytes.
 The only time your scheme would actually save space is if all values were
empty and you stored a zero length string in which case you'd use 1 byte for
the column instead of 6 bytes for 6 empty ones.

The important lesson here is you should always do measurements first to
confirm your beliefs as they may not actually be stupid!  And secondly as
everyone else pointed out, you aren't the first person wanting to store lots
of information in a database and there are best practises such as
normalization that help address your needs if you seek them out.  It is
never stupid to ask!  (Or read a book, or web sites etc)

SQLite type information:

  http://en.wikipedia.org/wiki/Database_normalization

SQLite file format - record:

  http://www.sqlite.org/fileformat.html#record_format

Schema normalization:

  http://en.wikipedia.org/wiki/Database_normalization

SQLite books:

  http://www.sqlite.org/books.html

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEUEARECAAYFAk1KCYcACgkQmOOfHg372QRC1gCfbaE7tCvNKFKNkq9N14tSTOzh
S30Al3Tosxpwa4qPvzqXFxeNptxfwPs=
=Jrag
-----END PGP SIGNATURE-----
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to