Good stuff!  Is this in the wiki somewhere?

On 6/12/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

"Adler, Eliedaat" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I want to add an index on a sparsely populated column -
>  i.e. for most rows the column is null.
>
> 1. Would this index be very compact? i.e. only include "set" rows.

No.  Entries appear in the index for NULL values too.

> 2. Could this index also help select NULL entries?
>

The index will be used if you say  "WHERE x IS NULL"  (assuming "x"
is the column that is indexed.)  But this might be counter-productive.
If most of your entries are NULL, and you say "WHERE x IS NULL" then
SQLite will do a binary search for each entry and so the runtime will
be O(KlogN) where K is the number of rows in the result set and N is
the total number of rows in the table.  If you do not use the index
(if you say "WHERE +x IS NULL") then the runtime will be O(N).  If
N and K are similar in magnitude (which will be the case if most of
your entries are NULL), then the query will run faster if it does not
use the index.

The previous paragraph is not the whole story, actually.
If you have two columns in the index:

   CREATE INDEX idx1 ON tbl1(x,y);

And your query only uses indexed columns:

   SELECT y FROM tbl1 WHERE x IS NULL;

Then that query is O(K) in time and is thus very fast, regardless
of how many entries have a NULL x.  But if you try to use a column
of tbl1 that is not indexed, like this:

   SELECT z FROM tbl1 WHERE x IS NULL;

Then you get the O(K*logN) behavior described by the first paragraph
above, which can be slow if K is of the same order of magnitude as N.

--
D. Richard Hipp <[EMAIL PROTECTED]>




-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]

-----------------------------------------------------------------------------


Reply via email to