Thank you - as I suspected. Query #2 was improved only with the creation of an index on 'someString'
________________________________ From: Mike Matrigali <[email protected]> To: Derby Discussion <[email protected]> Sent: Wednesday, September 9, 2009 3:59:11 PM Subject: Re: Table of 13 rows still needs an index to make it performant? T K wrote: > I have a table with a blob and 13 rows in it; in the following, please notice > the execution times: > > 1) Selecting the primary key: > SELECT primkey from sometable > Query 1 of 1 elapsed time (seconds) - Total: 0.032, SQL query: 0.016, > Building output: 0.016 > > 2) Select the primary key with WHERE clause: > SELECT primkey from sometable WHERE someString = 'someValue' > Query 1 of 1 elapsed time (seconds) - Total: 0.843, SQL query: 0.828, > Building output: 0.015 > > 3) Here's how many rows I have: > select count(*) from sometable > 13 > > Does the sheer fact that the table has a blob, albeit not selected, affect > query execution THAT much? Any suggestions to improve it? > > Thanks > > > Derby stores blobs in the same space as the other data, and without indexes the only way that derby can find the "next" row is by reading through all the pages after the current row. It stores indexes in separate files where the structure of the index lets it do direct lookups in the base table. The Blobs are stored as linked lists of pages. So performance is dependent on how the rows get laid out in the table. The worst case would be 1 row on page one followed by 2 gig of blob data, and then followed by another row. When an index is used Derby can jump directly to the page containing the beginning of the row. And when processing a single row it need not read the blob if it is not part of the query. In the case of query 1 it probably never even looked at the base table, just used the index which given the number of rows is probably a one page cached index. For query 2 it would be interesting to know if it is picking to use the index to do probes or not. printing the query plan for both would tell exactly what it is doing.
