Re: [sqlite] More columns vs. several tables

Dennis Cote Tue, 02 May 2006 08:54:01 -0700

Felix Schwarz wrote:

Hi,
I have to decide on a table layout for storing the data of my upcoming project.
Yesterday I made my way through the excellent presentation at http://www.sqlite.org/php2004/page-036.html and read the sentence
"Put small and frequently used columns early in the table to minimize the need to follow the overflow chain."
Now, that's interesting! And I'm wondering whether there is a big performance hit for a simple
    SELECT binarydata FROM entries WHERE somehash = 27817298;

when I use

    CREATE TABLE entries(
           entry_id INTEGER PRIMARY KEY,
           somehash  INTEGER,
           property1 INTEGER,
           property2  VARCHAR(255),
           property3  VARCHAR(255),
           binarydata BLOB
    );
instead of splitting the binary data (around 40K each) into two tables like this:
    CREATE TABLE entries(
           entry_id INTEGER PRIMARY KEY,
           somehash  INTEGER,
           property1 INTEGER,
           property2  VARCHAR(255),
           property3  VARCHAR(255),
           binary_id INTEGER
    );

    CREATE TABLE binaries(
           binary_id INTEGER PRIMARY KEY,
           binarydata BLOB
    );

and then use a select of this form:
    SELECT binarydata FROM binaries WHERE binary_id = (SELECT binary_id FROM entries WHERE somehash = 27817298);
Also, could the usage of VIEWs speed up the SELECTing of data in the second example? Or does it just use SELECTs under the hood itself, i.e. without any caching of data?
Thanks in advance for any feedback.

Felix

Felix,

I don't think you will see much difference between the two layouts. You have your search field, somehash, located early in your records, so SQLite does not need to follow the overflow chain to read its value.

The split tables should actually take a little longer because after SQLite has found the correct hash value, it must locate the binary in the binaries table using the binary_id from the entries table. This operation will be relatively fast, but it does not gain you anything.
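
A minimal sketch of that extra lookup, using Python's standard sqlite3 module and the split layout from the question. The sample row and the 40K zero-filled blob are made up for illustration, and the exact wording of the plan rows varies between SQLite versions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries(
        entry_id  INTEGER PRIMARY KEY,
        somehash  INTEGER,
        property1 INTEGER,
        property2 VARCHAR(255),
        property3 VARCHAR(255),
        binary_id INTEGER
    );
    CREATE TABLE binaries(
        binary_id  INTEGER PRIMARY KEY,
        binarydata BLOB
    );
""")

# Hypothetical sample data: one 40K blob referenced from one entries row.
blob = b"\x00" * 40 * 1024
conn.execute("INSERT INTO binaries(binary_id, binarydata) VALUES (1, ?)", (blob,))
conn.execute(
    "INSERT INTO entries(somehash, property1, property2, property3, binary_id) "
    "VALUES (27817298, 0, 'a', 'b', 1)"
)

query = (
    "SELECT binarydata FROM binaries WHERE binary_id = "
    "(SELECT binary_id FROM entries WHERE somehash = 27817298)"
)

# The last column of each EXPLAIN QUERY PLAN row describes one step.
plan_steps = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for step in plan_steps:
    print(step)

# The query still returns the blob; it just touches both tables to get there.
result = conn.execute(query).fetchone()[0]
```

The plan should list one step against entries (to find binary_id) and a second step against binaries (to fetch the row by its primary key), which is the additional work described above.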

The hint from the web page is pointing out the added time it takes to scan the overflow chain to find the value of a field located after a large field in a record. Your 40K binary fields will be split into multiple overflow blocks in either table.

To speed up this query you would be much better off adding an index on the somehash field. Then SQLite could use the index to quickly locate the correct record in your entries table. It will only scan the overflow chain for queries that return the value of the binarydata field.
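
As a rough sketch of the index suggestion, again using Python's standard sqlite3 module with the single-table layout from the question. The index name entries_somehash_idx is just an illustrative choice, and the plan text differs between SQLite versions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entries(
        entry_id   INTEGER PRIMARY KEY,
        somehash   INTEGER,
        property1  INTEGER,
        property2  VARCHAR(255),
        property3  VARCHAR(255),
        binarydata BLOB
    )
""")

query = "SELECT binarydata FROM entries WHERE somehash = 27817298"


def plan(sql):
    """Return the human-readable step descriptions from EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Without an index, SQLite has to scan the whole entries table.
plan_before = plan(query)

# Index name is an assumption for this example; any name works.
conn.execute("CREATE INDEX entries_somehash_idx ON entries(somehash)")

# With the index, the lookup on somehash becomes an index search.
plan_after = plan(query)

print(plan_before)
print(plan_after)
```

Comparing the two plans on your own database is a quick way to confirm that the query actually uses the index.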


HTH
Dennis Cote
