RE: [sqlite] SQLite and Columnar Databases

Darren Duncan Thu, 13 Dec 2007 17:22:34 -0800

At 7:59 AM -0500 12/13/07, Tom Briggs wrote:

 > Something I will say about this, for people who don't know, is that

 this columnar thing is strictly an implementation detail.  While


   I think that this is an oversimplification.  That's somewhat like
saying that the way you use a sledge hammer is no different than how you
use a claw hammer, because they're both hammers.  Anyone who tries to
hang a picture with a sledge hammer will be rather unhappy with your
advice.

   Though at some level how the data is stored is indeed an
implementation detail, to take full advantage of the fact that it is
requires re-thinking schema design and in some cases even query design.
See

http://www.fulltablescan.com/index.php?/archives/44-Compression-as-Normalization...-Kinda.htmlfor more info.

If you read further into your own link, including the comments,you'll see my case also being made.


I don't see your claw hammer / sledge hammer argument being applicable here.

The relational model of data is all about letting users defining thelogical/possible structure of their data, with a focus on the meaningof the data relative to other data. It is intended to make datamanipulation and queries both easy to use and logically rigorous.The schemas that users design, including their so-callednormalization traits, are just for reflecting the meaning of theirdata in a logical manner.

This model is completely removed from how the data is physicallystored, eg whether in rows first or in columns first, and thephysical store is determined just by the DBMS behind the scenes, andhence is an implementation detail. The DBMS can arrange how it likesin order to satisfy the logical model in a way that performs well.The users do not contort their schemas beyond what is logically cleanin order to gain performance; having a clean schema will let the DBMSinfer this automatically.

According to some comments, Vertica (a column-store maker) is makingthe same case that I am, which is just to have a logical cleanschema, and performance benefits will automatically follow from that.

I will also note that the highest level of schema normalization, 6thNormal Form, essentially puts every non-key column in its own table.

I will also note that a column-based store essentially works like aheavily indexed row-based store, in which there is an index on everykey or every column, and so all searches, which includes those onwhich joins are performed, can/do look in what is otherwise indexes.This is potentially slower for updates (or maybe not), but can befaster for queries, depending on circumstances.


-- Darren Duncan

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

RE: [sqlite] SQLite and Columnar Databases

Reply via email to