Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?

Pavel Ivanov Mon, 01 Mar 2010 08:38:42 -0800

> Anyway, it seems the OP has a point in saying that it would be nice
> --and I would say 'natural'-- to have the optimizer enhanced to handle
> "NOT <condition>" as efficiently as it handles "<condition>, provided
> such enhancement can be done with only little changes.

I totally disagree with you. Let's say you have 1,000,000 rows and 100
of them contain NULL. In this situation selecting NOT NULL will select
almost all rows which means that using index in this case doesn't give
any performance boost. So here using full scan for NOT NULL condition
is better and for NULL condition using index is better. Everything is
completely legitimate.

The only bad thing here is that functions like max(), min() and
count() are not special functions in SQLite like it is in some other
RDBMS. Because of this fact SQLite core doesn't know that for count()
it doesn't need the full row and if the table contains a lot of other
fields along with "text" then it will be indeed faster to use index on
"text" even for full scan as in example above. It's a historical
feature in SQLite and I doubt it can be easily changed anywhere in
near future.

BTW, when SQLite uses index on "text" field it needs to load full
values of "text" field anyway. So maybe OP meant something else?

Pavel

On Mon, Mar 1, 2010 at 11:24 AM, Jean-Christophe Deschamps
<j...@q-e-d.org> wrote:
>
>> > "SELECT count(*) WHERE NOT text IS NULL"
>> >
>> > requires that the complete text column is loaded. With a stored LOB
>> > this results in crazy performance.
>>
>>How did you find that? What do you mean by "requires loading of the
>>whole text column"? It pretty much can require even loading of text
>>columns that shouldn't be counted at all just because one database
>>page is the minimum storage entity loaded from disk.
>
> Hi Pavel,
>
> I believe the OP was intrigued/upset by the fact that
>
>     SELECT count(*) WHERE NOT text IS NULL;
> or (equivalent)
>     SELECT count(*) WHERE text IS NOT NULL
>
> does not use an index in the text column, while
>
>     SELECT count(*) WHERE text IS NULL
>
> does use the index.
>
> I've shown a (trivial) way to achieve the same 'not null' count using
> the index.
>
> Anyway, it seems the OP has a point in saying that it would be nice
> --and I would say 'natural'-- to have the optimizer enhanced to handle
> "NOT <condition>" as efficiently as it handles "<condition>, provided
> such enhancement can be done with only little changes.
>
> The optimizer is smart enough to handle multiple conditions connected
> by AND and OR and use index for every condition (when they are
> available, of course), but it reverts to full scan for any NOT
> <condition>, whatever condition is (simple or complex).
>
> I'm certainly not in a position to dictate how should the optimizer
> should evolve but, as a mere user, I feel that situation a little less
> than satisfactory.
>
>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?

Reply via email to